Cancer risk estimating method using sequence polymorphisms in a specific
region of chromosome 19

The present invention provides methods and compositions for identifying human
subjects with an increased risk of having or developing cancer. In particular, this
invention relates to the identification and characterization of polymorphisms in
human chromosome 19q, in the region r located at approximately 19q13.2-3, that are
correlated with an increased risk of developing cancer and with the responsiveness
of a subject to various treatments for cancer.

Background 

DNA polymorphisms provide an efficient way to study the association of genes and
diseases by analysis of linkage and linkage disequilibrium. With the sequencing of
the human genome, a myriad of hitherto unknown genetic polymorphisms among
people have been detected. Most common among these are the single nucleotide
polymorphisms, also called SNPs, of which several million are now known. Other
examples are variable number of tandem repeat polymorphisms, insertions, deletions
and block modifications. Tandem repeats often have multiple different alleles
(variants), whereas the other groups of polymorphisms usually have just two alleles.
Some of these genetic polymorphisms probably play a direct role in the biology of
the individuals, including their risk of developing disease, but the virtue of the
majority is that they can serve as markers for the surrounding DNA, and thus serve as
leads during a search for a causative gene polymorphism, as substitutes in the
evaluation of its role in health and disease, and as substitutes in the evaluation of
the genetic constitution of individuals.

The association of an allele of one sequence polymorphism with particular alleles of
other sequence polymorphisms in the surrounding DNA has two origins, known in
the genetic field as linkage and linkage disequilibrium, respectively. Linkage arises
because large parts of chromosomes are passed unchanged from parents to offspring,
so that minor regions of a chromosome tend to flow unchanged from one
generation to the next and also to be similar in different branches of the same
family. Linkage is gradually eroded by recombination occurring in the cells of the
germline, but typically operates over multiple generations and distances of a number
of million bases in the DNA.

Linkage disequilibrium deals with whole populations and has its origin in the (distant)
forefather in whose DNA a new sequence polymorphism arose. The immediate
surroundings in the DNA of the forefather will tend to stay with the new allele for many
generations. Recombination and changes in the composition of the population will
again erode the association, but the new allele and the alleles of any other
polymorphism nearby will often be partly associated among unrelated humans even
today. A crude estimate suggests that alleles of sequence polymorphisms with
distances less than 10000 bases in the DNA will have tended to stay together since
modern man arose. Linkage disequilibrium in limited populations, for instance
Europeans, often extends over longer distances. This can be the result of newer
mutations, but can also be a consequence of one or more "bottlenecks" with small
population sizes and considerable inbreeding in the history of the current population.
Two obvious possibilities for "bottlenecks" in Europeans are the exodus from Africa
and the repopulation of Europe after the last ice age.

Linkage disequilibrium is the result of many stochastic events and as such is subject
to statistical variation, occasionally resulting in discontinuities, lack of a monotonic
relationship between association and distance, and differences between people of
different ethnicity. Therefore, it is often advantageous to study more than one
sequence polymorphism in a given region. This also allows for further definition of the
genetic surroundings of the biologically relevant polymorphism by combining the
associated alleles of the different markers into a so-called haplotype.
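
The strength of association between the alleles of two markers is commonly
summarised by the coefficient D and the squared correlation r^2. These measures are
standard in population genetics but are not defined in this application, so the sketch
below is an illustration only; the function name and the frequencies used are
hypothetical.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D, r_squared) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: population frequencies of allele A and allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # D is the excess of the observed haplotype frequency over the frequency
    # expected if the two alleles were inherited independently.
    d = p_ab - p_a * p_b
    # r^2 rescales D^2 by the allele-frequency variances.
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, r_squared


# Hypothetical frequencies, for illustration only.
print(ld_stats(0.5, 0.5, 0.5))   # alleles always found together
print(ld_stats(0.25, 0.5, 0.5))  # alleles independent of each other
```

Values of r^2 near 1 indicate that one marker is a good substitute for the other;
values near 0 indicate that the association has been eroded.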

Humans in general carry two copies of each human chromosome in each cell. There
are exceptions to this rule, not relevant to this application. We therefore speak about
genotypes, i.e. the combined analysis of both chromosomes at a given sequence
polymorphism. The resulting genotypes of a person, analysed for instance on DNA
from peripheral blood leukocytes, are inherently very stable over time. Therefore,
this type of analysis can be performed at any time in the life of a person and will be
applicable to this person for his or her entire life. By the same token, such genetic
analyses are ideally suited to predict future risks of disease.

A variety of investigations suggest that many diseases are in part determined by the
genetic constitution of the individual. One group of genes in particular has been
associated with rare genetic predispositions to cancer. These are the genes involved
in maintaining the integrity of a person's DNA, the so-called DNA repair genes. One
set of such genes are the XP genes, which participate in nucleotide excision repair
and, when mutated, give rise to a 1000-fold increased risk of getting skin cancer. For
this reason we have previously investigated single nucleotide polymorphisms in one
DNA repair gene, XPD, for association with risk of skin cancer in a cohort of
Caucasian Americans, and found that one allele of the sequence polymorphism called
XPDe6 was associated with a moderately increased risk of getting basal cell
carcinoma, the most common form of skin cancer. Later, other groups have studied the
association between sequence polymorphisms in this and other DNA repair genes
and various forms of cancer. Some have reported positive results.

Very little is known about the function of the gene RAI. It was cloned because its
protein product binds to and inhibits RelA of the transcription regulator NF-kappaB.

Summary of the Invention 

The present invention relates in a first aspect to a group of nucleic acid sequences
found to be associated with cancer. The invention further relates to transcriptional
and translational products of said sequences. An allele in the r region can be
identified as correlated with an increased risk of developing cancer, the prognosis of
developed cancer, and responsiveness to cancer treatment on the basis of statistical
analyses of the incidence of a particular allele in individuals diagnosed with cancer.

Thus, in a first aspect the invention relates to a method for estimating the cancer risk
of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
  sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1, or a part thereof, or

  - in a region complementary to SEQ ID NO: 1, or a part thereof, or

  - in a transcription product from a sequence in a region corresponding to SEQ
    ID NO: 1, or a part thereof, or

  - in a translation product from a sequence in a region corresponding to SEQ ID
    NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer risk of said individual based on the sequence polymorphism
  response.

The estimation of the cancer risk of an individual can involve the comparison of the
number and/or kind of polymorphic sequences identified with a predetermined
cancer risk profile. Such a profile can be based on statistical data obtained for a
relevant reference group of individuals.
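
One way such a risk profile could be derived from a reference group is an odds ratio
for carriers of a given allele among cases and controls. The following is a minimal
sketch under that assumption; the application does not prescribe this calculation at
this point, the function name is illustrative, and the counts are hypothetical
placeholders rather than study data.

```python
import math


def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers):
    """Odds ratio with an approximate 95% interval (Woolf's log method)."""
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper


# Hypothetical counts, for illustration only.
ratio, lower, upper = odds_ratio(30, 70, 15, 85)
print(f"OR = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An odds ratio above 1 whose interval excludes 1 would suggest that carriers of the
allele are over-represented among cases in the reference group.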
The sequence of the r region is set forth as SEQ ID NO: 1, originating from the
cloning of human chromosome 19q published as part of the contig NT_011109 in the
database of human sequences established by the National Center for Biotechnology
Information and located on the Internet at
http://www.ncbi.nlm.nih.gov/genome/guide/human/

The presence of an allele is determined by determining the nucleic acid sequence of
all or part of the region according to standard molecular biology protocols well
known in the art as described for example in Sambrook et al. (1989) and as set forth
in the Examples provided herein, or products of the nucleic acid sequences.

In particular, the nucleic acid molecules of the present invention represent in a first
aspect nucleic acid sequences forming part of the region r corresponding to position
1522-37752 of SEQ ID NO: 1, and preferably to certain nucleic acid sequences
within the gene referred to herein as RAI. As demonstrated in the Examples
presented below, the RAI gene is in particular associated with human cancer diseases.

Furthermore, the invention relates to a method for estimating the cancer prognosis
of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material including human genes in said sample a
  sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1, or a part thereof, or

  - in a region complementary to SEQ ID NO: 1, or a part thereof, or

  - in a transcription product from a sequence in a region corresponding to SEQ
    ID NO: 1, or a part thereof, or

  - in a translation product from a sequence in a region corresponding to SEQ ID
    NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer prognosis of said individual based on the sequence
  polymorphism response.

The estimation of the cancer prognosis of an individual can involve the comparison
of the number and/or kind of polymorphic sequences identified with a predetermined
cancer prognosis profile. Such a profile can be based on statistical data obtained for
a relevant reference group of individuals.

Additionally provided is a method of identifying a human subject as having an
increased likelihood of responding to a treatment, comprising a) correlating the
presence of an r region allele genotype with an increased likelihood of responding to
treatment; and b) determining the r region allele genotype of the subject, whereby a
subject having an r region allele genotype correlated with an increased likelihood of
responding to treatment is identified as having an increased likelihood of responding
to treatment.

Thus, the present invention also relates to a method for estimating a treatment
response of an individual suffering from cancer to a cancer treatment, comprising

- providing a sample from said individual,






07/10 2002 15:12 FAX

P887DK01 


- assessing in the genetic material including human genes in said sample a
  sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1, or a part thereof, or

  - in a region complementary to SEQ ID NO: 1, or a part thereof, or

  - in a transcription product from a sequence in a region corresponding to SEQ
    ID NO: 1, or a part thereof, or

  - in a translation product from a sequence in a region corresponding to SEQ ID
    NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the individual's response to the cancer treatment based on the
  sequence polymorphism response.

The estimation of the individual's response to cancer treatment can involve the
comparison of the number and/or kind of polymorphic sequences identified with a
predetermined cancer treatment response profile. Such a profile can be based on
statistical data obtained for a relevant reference group of individuals.

The invention also comprises primers or probes for use in the invention, as well as
kits including these. The primers and/or probes are preferably capable of hybridising
to SEQ ID NO: 1, or a part thereof, in particular the r region, or a part thereof,
under stringent conditions.

Furthermore, the invention also relates to cloning vectors and expression vectors
containing the nucleic acid molecules of the invention, as well as hosts which have
been transformed with such nucleic acid molecules, including cells genetically
engineered to contain the nucleic acid molecules of the invention, and/or cells
genetically engineered to express the nucleic acid molecules of the invention. The
nucleic acids are preferably isolated from the r region and preferably contain one or
more sequence polymorphisms as described herein below in more detail. In addition to
host cells and cell lines, hosts also include transgenic non-human animals (or
progeny thereof).

In particular, the present invention is based on the discovery of the correlation
between single nucleotide polymorphisms (SNPs) and/or tandem repeats in the r region
and cancer. Thus, SNPs have been found in the r region as shown in Table 1. However,
the present invention is not limited to the SNPs shown in Table 1, but does include
any SNP in the region. Tandem repeats have been found in the r region as shown in
Table 2. However, the present invention is not limited to the tandem repeats shown
in Table 2, but does include any tandem repeat in the region.

The term human includes both a human having or suspected of having a cancer
disease and an asymptomatic human who may be tested for predisposition or
susceptibility to cancer. At each position the human may be homozygous for an allele
or the human may be a heterozygote.
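
The distinction between a homozygous and a heterozygous position can be sketched as
follows. This is only an illustration of the terminology; the function names are
hypothetical and the alleles shown are examples, not results.

```python
def classify_genotype(allele_1: str, allele_2: str) -> str:
    """Classify a genotype from the alleles found on the two chromosome copies."""
    a, b = allele_1.upper(), allele_2.upper()
    return "homozygous" if a == b else "heterozygous"


def allele_count(genotype: tuple[str, str], allele: str) -> int:
    """Number of copies (0, 1 or 2) of a given allele carried at one position."""
    return sum(1 for g in genotype if g.upper() == allele.upper())


# Example alleles at one SNP position, for illustration only.
print(classify_genotype("A", "A"))   # homozygous
print(classify_genotype("A", "G"))   # heterozygous
print(allele_count(("A", "G"), "G"))  # 1
```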
Drawings

Fig. 1 shows a subregion of chromosome 19q

Fig. 2 shows odds ratios and p-values for individual sequence variations in relation
to risk of basal cell carcinoma

Fig. 3 shows odds p-values for association of different sequence variations with risk
of basal cell carcinoma among psoriatic Danes

Detailed description of the invention 

The present invention relates to a characterization of a person's present and/or
future risk of getting certain forms of cancer. The characterization is based on the
analysis of sequence polymorphisms in a region of chromosome 19q in the person.

A number of polymorphisms in the chromosomal region 19q13.2-3 have been
identified and characterised. Surprisingly, the sequence polymorphisms with strongest
association to disease appeared to be located outside XPD. More specifically, the
sequences were located in a sub-region between XPD and ERCC1, and seemed to
have a maximum in or around the gene RAI (see Example 1). For persons getting
their skin cancer relatively early (before 50 years of age), it was found that
predictions got better (Example 2), and when two sequence polymorphisms in RAI were
combined, the prediction of early skin cancer got even better (Example 3). It was
also possible to combine sequence polymorphisms in RAI with sequence
polymorphisms outside the region and get highly positive results (Example 4).



The region of chromosome 19q, more precisely the region located in 19q13.2-3, with
which the present invention is concerned, is depicted in Figure 1 as it is presently
known, together with the presently known or suspected genes. The arrows indicate
the directions of transcription of the genes. The absolute chromosome positions
shown are from the particular build of NCBI's map of chromosome 19, and will
probably change with time.

The region r stretches from the beginning of, but not including, the XPD gene, to
approximately the end of ERCC1 and includes the genes RAI, LOC162878, and
ASE-1. More specifically, r is bounded by and includes the following two sequences:
AGAACCCCCG CCCCTCCACC TCGTCTCAAA and TCCCTCCCCA GAGACTGCAC
CAGCGCAGCC, and is defined by SEQ ID NO: 1.
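
Because the region is defined by its two bounding sequences, it can in principle be
located in any genomic sequence by searching for those boundaries. The sketch below
assumes exact matches on one strand only and uses a short toy sequence; the function
name is hypothetical, and a real search would also need to handle the complementary
strand and sequence variation.

```python
def find_bounded_region(sequence, left_flank, right_flank):
    """Return 1-based inclusive (start, end) of the span bounded by and
    including both flanking sequences, or None if either is not found."""
    seq = sequence.upper().replace(" ", "")
    left = left_flank.upper().replace(" ", "")
    right = right_flank.upper().replace(" ", "")
    start = seq.find(left)
    if start == -1:
        return None
    # Look for the right-hand boundary only downstream of the left-hand one.
    end = seq.find(right, start + len(left))
    if end == -1:
        return None
    return start + 1, end + len(right)


# Toy sequence and flanks, for illustration only.
toy = "TTTAACCGGGGGTTCCAAA"
print(find_bounded_region(toy, "AACC", "TTCC"))
```

Spaces in the flanking sequences are ignored, so the 10-base groups used in the text
above can be passed in unchanged.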



In the present context the region r means SEQ ID NO: 1 and the complementary
sequence as well as transcriptional products and translational products thereof.

One preferred section of the region r stretches approximately from the end of 1^1 to
the beginning of ASE-1 and includes the genes RAI, LOC16297e, and ASE-1. More
specifically, this section of r is bounded by and includes the following sequences:
GAAGTGAGCC AAGATCACGC CACTGCACTC and GTGCCCACCT GGGCCACCAG
AAGGTGACAC. In the present context this section of the region r means SEQ ID NO: 1
bases 1522-37752 and the complementary sequence as well as transcriptional products
and translational products thereof.

Finally, in the claims the gene RAI is defined as including transcribed sequences of
the gene plus a 1500 base upstream promoter region. More specifically, RAI is
bounded by and includes the following sequences: CATAACCACA ATGATGAGCA
TGTATTGAGT and ATGTTGTCCA GGCTGGTCTT GAAGTCCTGA. In the present
context this section of the region r relates to SEQ ID NO: 1 bases 7761-22885 and the
complementary sequence as well as transcriptional products and translational
products thereof.
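
The base ranges given above make it straightforward to decide which sub-regions a
given position of SEQ ID NO: 1 falls in. A minimal sketch, using only the two ranges
stated in the text; the dictionary and function names are hypothetical.

```python
# Ranges are 1-based and inclusive, as quoted in the text for SEQ ID NO: 1.
SUBREGIONS = {
    "preferred section of r": (1522, 37752),
    "RAI": (7761, 22885),
}


def subregions_containing(position: int) -> list[str]:
    """Names of the sub-regions of SEQ ID NO: 1 that contain the position."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return [name for name, (start, end) in SUBREGIONS.items()
            if start <= position <= end]


print(subregions_containing(7887))   # inside both ranges
print(subregions_containing(137))    # inside neither range
```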

Modifications to the human genome map are known to occur from time to time. It is
therefore possible that the defining sequences quoted above will change slightly in
future maps.



Fragments or parts of the region r as used herein relate to any fragment of at least
100 nucleic acid residues in length, or multiples of 100 nucleic acid residues in length,
starting from SEQ ID NO: 1 position 1, 100, 200, 300, 400, 500, 600, 700, 800, 900,
1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200,
2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, and so forth, each fragment
starting position having an increment of 100 nucleic acid residues. Multiples are
preferably multiples of e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49 and 50.

For fragments starting at position 1, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, and so forth, using suitable multipliers as listed herein above.

For fragments starting at position 100, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, and so forth, using suitable multipliers as listed herein above.

For fragments starting at position 7700, the length of said fragments will thus be e.g.
100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500,
1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700,
2800, 2900, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000,
8500, 9000, 9500, 10000, 10500, 11000, 11500, 12000, 12500, 13000, 13500,
14000, 14500, 15000, and so forth, using suitable multipliers such as e.g. the
ones listed herein above.
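
The fragment definition above can be sketched as an enumeration of start and end
coordinates. The sketch assumes that a fragment must lie wholly within the region and
takes the region length as a parameter, since the length of SEQ ID NO: 1 is not stated
here; the function name and the small example length are hypothetical.

```python
def fragment_coordinates(region_length, unit=100, max_multiple=50):
    """List 1-based inclusive (start, end) pairs for fragments whose lengths
    are multiples of `unit`, starting at position 1, unit, 2*unit, and so on."""
    starts = [1] + list(range(unit, region_length + 1, unit))
    fragments = []
    for start in starts:
        for multiple in range(1, max_multiple + 1):
            end = start + multiple * unit - 1
            if end > region_length:
                break  # longer fragments from this start would not fit either
            fragments.append((start, end))
    return fragments


# Small example length, for illustration only.
for start, end in fragment_coordinates(400, max_multiple=2):
    print(start, end, end - start + 1)
```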
The nucleic acid sequences according to the present invention make it possible to
estimate cancer risk in an individual by using sequence polymorphisms originating
from a specific region of chromosome 19.

Estimation of cancer risks has a number of important applications: 

(1) Individuals with reasons to suspect that they are at risk for getting cancer would
be able to clarify their situation and, if possible, take protective action. Alternatively,
anti-cancer campaigns, companies, hospitals or other institutions could offer a
service to help people clarify their situation. It would for instance be possible to test
persons when they got their first basal cell carcinoma, which is often recurrent and also
is a moderate predictor for other cancers. If the persons were in a high-risk group,
one could then advise them about, or they could of their own accord choose,
risk-reducing behaviour, such as avoidance of excessive sun-exposure, abstaining from
smoking etc. About 5 percent of the Danish population will at some point in their life
get a basal cell carcinoma.

(2) Anti-cancer campaigns, companies, hospitals or other institutions would be able
to define relevant target subpopulations and focus information on risk-reducing
behaviour on these persons. They might perhaps also be in a position to inform the
remainder of the population that they need not worry. Lung cancer affects
approximately 10-15 percent of smokers and thus approximately 5 percent of the
population, somewhat varying from country to country. Malignant melanoma, a
sun-induced, often lethal form of skin cancer, affects approximately 700 persons a year
in Denmark or about 1 percent of the Danish population.

(3) The drugs used in cancer treatment are often carcinogenic themselves and
individual responses to them vary considerably, both with respect to tolerance to the
treatment and with respect to efficacy of the treatment. It is an obvious possibility
that the region of chromosome 19 here dealt with, which contains DNA repair genes
known to modulate carcinogen responses, also modulates response to anti-cancer
agents. Hence, analysis of the region may facilitate better choices of treatment for
cancer, and/or help predict the future course of disease.
By sequence polymorphism is understood any single nucleotide, tandem repeat,
insert, deletion or block polymorphism, which varies among humans, whether it is of
biological importance or not.

Position of sequence polymorphism in the region r

In one embodiment of the methods of the invention, preferably the method for
diagnosis as described herein, one or more single nucleotide polymorphism(s) at a
predetermined position in the region r (SEQ ID NO: 1) are identified and used for e.g.
cancer risk profiling and/or cancer treatment response profiling. Presently preferred
single nucleotide polymorphism(s) are listed in Table 1. However, the present
invention relates to any SNP in the r region.

Table 1

Identification in dbSNP*      Position in SEQ ID NO: 1
rs#3138378 A/G                137
rs#3138376 G/T                235
rs#209725 C/A                 ambiguous location
rs#2377328 C/T                7199
rs#6966 A/T                   7887 (=RAIe6)
rs<^17154 A/C                 12115
rs#2017104 NG                 12190
r^070830 T/G                  14575
rs#1970764 A/G                15798 (=RAIi1)
rs#2226949 T/G                32035
rsms&Asr err                  32446
rs#2336218 C/A                32447
rs#766934 A/0                 32481
rs#928911                     32785
rs#1005165 C/T                33974
rs#1005ie6 C/T                34119
rs#967591 A/G                 34858 (=ASE-1e1)
rs#1046282 T/C                35596
rs#2013521 fiJT               36254
rs#735482 A/C                 36926
rs#762562 fiJG                37267
rs#2336919                    ambiguous location
r^43571 C/G                   37786

* dbSNP is the database of SNPs established by the National Center for
Biotechnology Information and located on the Internet at http://www.ncbi.nlm.nih.gov/SNP/.

In another embodiment of the invention, the method for diagnosis described herein
is preferably one in which the tandem repeat is at a position as described in Table 2:

Table 2

Identification in UniSTS*

D19S906
STS-W67936
D19S543
D19S393
STS-R48186
GDB:181915
RH47033
GDB:190019

* UniSTS is a database of unique sequence tag sites established by the National Center
for Biotechnology Information and located on the Internet at
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unists

In another embodiment of the invention, the method for diagnosis described herein
is preferably one in which the sequence polymorphism is in region r. Testing for the
presence of the RAI gene allele is especially preferred because, without wishing to
be bound by theoretical considerations, of its association with increased risk of
cancer (as explained herein).
The sequence polymorphism of the invention comprises at least one base
difference, such as at least two base differences. As described above, the sequence
polymorphism comprises at least one single nucleotide polymorphism, such as at least
two single nucleotide polymorphisms. Also, the sequence polymorphism comprises
at least one tandem repeat polymorphism, such as at least two tandem repeat
polymorphisms.

Also, the sequence polymorphism may be a combination of single nucleotide
polymorphisms and tandem repeats.

The status of the individual may be determined by reference to allelic variation at
one, two, three, four or more of the above loci.

Cell sample 



The cell sample used in the present invention may be any suitable cell sample
capable of providing the genetic material for use in the method. In a preferred
embodiment, the cell sample is a blood sample, a tissue sample, a sample of secretion,
semen, ovum, a washing of a body surface (e.g. a buccal swab), a clipping of a
body surface (hairs or nails), such as wherein the cell is selected from white blood
cells and tumour tissue.

It will be appreciated that the test sample may equally be a nucleic acid sequence
corresponding to the sequence in the test sample, that is to say that all or a part of
the region in the sample nucleic acid may firstly be amplified using any convenient
technique, e.g. PCR, before use in the analysis of variation in the region.

Detection methods 

Detection may be conducted on the sequence of SEQ ID NO: 1 or a complementary
sequence as well as on transcriptional products (mRNA) and translational products
(polypeptides, proteins) therefrom.

It will be apparent to the person skilled in the art that there are a large number of
analytical procedures which may be used to detect the presence or absence of
variant nucleotides at one or more of the positions mentioned herein in the r region.
Mutations or polymorphisms within or flanking the usalOQ can be detected by utilizing a
number of techniques. Nucleic acid from any nucleated cell can be used as the
starting point for such assay techniques, and may be isolated according to standard
nucleic acid preparation procedures that are well known to those of skill in the art. In
general, the detection of allelic variation requires a mutation discrimination
technique, optionally an amplification reaction and a signal generation system. Table 3
lists a number of mutation detection techniques, some based on the PCR. These
may be used in combination with a number of signal generation systems, a selection
of which is listed in Table 4. Further amplification techniques are listed in Table 5.
Many current methods for the detection of allelic variation are reviewed by Nollau et
al., Clin. Chem. 43, 1114-1120, 1997; and in standard textbooks, for example
"Laboratory Protocols for Mutation Detection", Ed. by U. Landegren, Oxford University
Press, 1996 and "PCR", 2nd Edition by Newton & Graham, BIOS Scientific
Publishers Limited, 1997.



Table 3 



Abbreviations:

ALEX.TM.    Amplification refractory mutation system linear extension
APEX        Arrayed primer extension
ARMS.TM.    Amplification refractory mutation system
b-DNA       Branched DNA
CMC         Chemical mismatch cleavage
bp          base pair
COPS        Competitive oligonucleotide priming system
DGGE        Denaturing gradient gel electrophoresis
FRET        Fluorescence resonance energy transfer
LCR         Ligase chain reaction
MASDA       Multiple allele specific diagnostic assay
NASBA       Nucleic acid sequence based amplification
OLA         Oligonucleotide ligation assay
PCR         Polymerase chain reaction
PTT         Protein truncation test
RFLP        Restriction fragment length polymorphism
SDA         Strand displacement amplification
SNP         Single nucleotide polymorphism
SSCP        Single-strand conformation polymorphism analysis
SSR         Self sustained replication
TGGE        Temperature gradient gel electrophoresis



Table 3 illustrates various mutation detection techniques capable of being used for
SNP detection.

General techniques: DNA sequencing, Sequencing by hybridisation, SNAPshot

Scanning techniques: PTT, SSCP, DGGE, TGGE, Cleavase, Heteroduplex analysis,
CMC, Enzymatic mismatch cleavage

Hybridisation Based techniques

Solid phase hybridisation: Dot blots, MASDA, Reverse dot blots, Oligonucleotide
arrays (DNA Chips)

Solution phase hybridisation: Taqman.TM. - U.S. Pat. No. 5,210,015 & 5,487,972
(Hoffmann-La Roche), Molecular Beacons - Tyagi et al (1996), Nature Biotechnology,
14, 303; WO 95/13399 (Public Health Inst., New York), Lightcycler, optionally in
combination with FRET.

Extension Based: ARMS.TM., ALEX.TM. - European Patent No. EP 332435 B1
(Zeneca Limited), COPS - Gibbs et al (1989), Nucleic Acids Research, 17, 2347.

Incorporation Based: Mini-sequencing, APEX

Restriction Enzyme Based: RFLP, Restriction site generating PCR

Ligation Based: OLA

Other: Invader assay

Table 4

Various signal generation or detection systems are listed below:

Fluorescence: FRET, Fluorescence quenching, Fluorescence polarisation - United
Kingdom Patent No. 2228998 (Zeneca Limited)

Other: Chemiluminescence, Electrochemiluminescence, Raman, Radioactivity,
Colorimetric, Hybridisation protection assay, Mass spectrometry

Table 5 illustrates examples of further amplification techniques.
Table 5

SSR, NASBA, LCR, SDA, b-DNA

Preferred mutation detection techniques include ARMS.TM., ALEX.TM., COPS,
Taqman, Molecular Beacons, RFLP, and restriction site based PCR and FRET
techniques.

Particularly preferred methods include FRET, Taqman, ARMS.TM. and RFLP based
methods.

In a preferred embodiment, mutations or polymorphisms can be detected by using a
microassay of nucleic acid sequences immobilized to a substrate or "gene chip"
(see, e.g. Cronin, et al., 1996, Human Mutation 7:244-255).

Further, improved methods for analyzing DNA polymorphisms, which can be utilized
for the identification of region r specific mutations, have been described that
capitalize on the presence of variable numbers of short, tandemly repeated DNA
sequences between the restriction enzyme sites. For example, Weber (U.S. Pat. No.
5,075,217) describes a DNA marker based on length polymorphisms in blocks of
(dC-dA)n-(dG-dT)n short tandem repeats. The average separation of
(dC-dA)n-(dG-dT)n blocks is estimated to be 30,000-60,000 bp. Markers that are so closely
spaced exhibit a high frequency of co-inheritance, and are extremely useful in the
identification of genetic mutations, such as, for example, mutations within the RAI
gene, and the diagnosis of diseases and disorders related to RAI mutations.

Also, Caskey et al. (U.S. Pat. No. 5,364,759) describe a DNA profiling assay for
detecting short tri- and tetra-nucleotide repeat sequences. The process includes
extracting the DNA of interest, such as the RAI gene, amplifying the extracted DNA,
and labelling the repeat sequences to form a genotypic map of the individual's DNA.
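
In sequence data, the length polymorphism of a short tandem repeat corresponds to the
number of consecutive copies of the repeat unit. The following is a minimal sketch of
counting such copies in a sequence string; it is not the assay of Weber or of Caskey
et al., which work on amplified DNA fragments, and the function name and toy sequence
are hypothetical.

```python
import re


def longest_tandem_repeat(sequence: str, motif: str) -> int:
    """Largest number of consecutive copies of `motif` found in `sequence`."""
    seq, unit = sequence.upper(), motif.upper()
    if not unit:
        raise ValueError("motif must not be empty")
    # Each match is a maximal run of back-to-back copies of the motif.
    runs = re.finditer(f"(?:{re.escape(unit)})+", seq)
    return max((len(run.group(0)) // len(unit) for run in runs), default=0)


# Toy sequence carrying a (CA)n block, for illustration only.
print(longest_tandem_repeat("TTCACACACAGG", "CA"))
```

Two alleles of the same marker would then differ in the count returned for the same
repeat unit.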



The level of RAI gene expression can also be assayed. For example, RNA from a
cell type or tissue known, or suspected, to express the RAI gene, such as brain,
may be isolated and tested utilizing hybridization or PCR techniques such as are
described above. The isolated cells can be derived from cell culture or from a
patient. The analysis of cells taken from culture may be a necessary step in the
assessment of cells to be used as part of a cell-based gene therapy technique or,
alternatively, to test the effect of compounds on the expression of the RAI gene. Such
analyses may reveal both quantitative and qualitative aspects of the expression
pattern of the RAI gene, including activation or inactivation of RAI gene expression.

In one embodiment of such a detection scheme, a cDNA molecule is synthesized
from an RNA molecule of interest (e.g., by reverse transcription of the RNA
molecule into cDNA). A sequence within the cDNA is then used as the template for a
nucleic acid amplification reaction, such as a PCR amplification reaction, or the
like. The nucleic acid reagents used as synthesis initiation reagents (e.g.,
primers) in the reverse transcription and nucleic acid amplification steps of this
method are chosen from among the RAI gene nucleic acid reagents described above.
The preferred lengths of such nucleic acid reagents are at least 9-30 nucleotides.
For detection of the amplified product, the nucleic acid amplification may be
performed using radioactively or non-radioactively labeled nucleotides.
Alternatively, enough amplified product may be made such that the product may be
visualized by standard ethidium bromide staining or by utilizing any other
suitable nucleic acid staining method.

Additionally, it is possible to perform such RAI gene expression assays "in situ",
i.e., directly upon tissue sections (fixed and/or frozen) of patient tissue
obtained from biopsies or resections, such that no nucleic acid purification is
necessary. Nucleic acid reagents such as those described above may be used as
probes and/or primers for such in situ procedures (see, for example, Nuovo, G. J.,
1992, "PCR In Situ Hybridization: Protocols And Applications", Raven Press, NY).

Alternatively, if a sufficient quantity of the appropriate cells can be obtained,
standard Northern analysis can be performed to determine the level of mRNA
expression of the RAI gene.

Activity of the gene

Another method for detecting sequence polymorphism is by analysing the activity of
gene products resulting from the sequences. Accordingly, in one embodiment the
detection uses the activity of the RAI gene product as compared to a reference in
the method. In particular, if the activity of the gene is decreased or increased by
at least or about 50%, such as at least or about 40%, for example at least or about
30%, such as at least or about 20%, for example at least or about 10%, such as at
least or about 5%, for example at least or about 2%, it indicates a sequence
polymorphism in the gene.
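
The comparison against a reference can be sketched in a few lines of Python. This is only an illustration: the function names, the way activity is quantified, and the default 20% cut-off (one of the values listed above) are assumptions, not something the text prescribes.

```python
def relative_activity_change(sample: float, reference: float) -> float:
    """Signed fractional change of the sample activity versus the reference."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return (sample - reference) / reference


def indicates_polymorphism(sample: float, reference: float,
                           threshold: float = 0.20) -> bool:
    """True if activity is decreased or increased by at least `threshold`.

    `threshold` is a fraction (0.20 = 20%); the default is an assumed
    example value chosen from the range of percentages given in the text.
    """
    return abs(relative_activity_change(sample, reference)) >= threshold
```

With hypothetical measurements, `indicates_polymorphism(70.0, 100.0)` flags a 30% decrease, while `indicates_polymorphism(95.0, 100.0)` does not reach the 20% cut-off.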



Mutations outside the region



The present invention may combine the result of sequence polymorphism within the
region r with sequence polymorphism outside the region in order to increase the
probability of the correlation.

Primers 

The primer nucleotide sequences of the invention further include: (a) any
nucleotide sequence that hybridizes to a nucleic acid molecule of the region r or
its complementary sequence or RNA products under stringent conditions, e.g.,
hybridization to filter-bound DNA in 6x sodium chloride/sodium citrate (SSC) at
about 45°C followed by one or more washes in 0.2x SSC/0.1% SDS at about 50-65°C,
or (b) under highly stringent conditions, e.g., hybridization to filter-bound
nucleic acid in 6x SSC at about 45°C followed by one or more washes in 0.1x
SSC/0.2% SDS at about 68°C, or under other hybridization conditions which are
apparent to those of skill in the art (see, for example, Ausubel F.M. et al., eds.,
1989, Current Protocols in Molecular Biology, Vol. I, Green Publishing Associates,
Inc., and John Wiley & Sons, Inc., New York, at pp. 6.3.1-6.3.6 and 2.10.3).
Preferably the nucleic acid molecule that hybridizes to the nucleotide sequence of
(a) and (b), above, is one that comprises the complement of a nucleic acid molecule
of the region r or a complementary sequence or RNA product thereof. In a preferred
embodiment, nucleic acid molecules comprising the nucleotide sequences of (a) and
(b) comprise a nucleic acid molecule of RAI or a complementary sequence or RNA
product thereof.

Among the nucleic acid molecules of the invention are deoxyoligonucleotides
("oligos") which hybridize under highly stringent or stringent conditions to the
nucleic acid molecules described above. In general, for probes between 14 and 70
nucleotides in length the melting temperature (Tm) is calculated using the formula:

$$T_m\,(^\circ\mathrm{C}) = 81.5 + 16.6\,\log[\text{monovalent cations (molar)}] + 0.41\,(\%\,\mathrm{G{+}C}) - \frac{500}{N}$$

where N is the length of the probe. If the hybridization is carried out in a
solution containing formamide, the melting temperature is calculated using the
equation

$$T_m\,(^\circ\mathrm{C}) = 81.5 + 16.6\,\log[\text{monovalent cations (molar)}] + 0.41\,(\%\,\mathrm{G{+}C}) - 0.61\,(\%\,\text{formamide}) - \frac{500}{N}$$

where N is the length of the probe. In general, hybridization is carried out at
about 20-25 degrees below Tm (for DNA-DNA hybrids) or 10-15 degrees below Tm (for
RNA-DNA hybrids).
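
The two melting temperature formulas can be evaluated with a short Python sketch. The function and parameter names are hypothetical, the logarithm is assumed to be base 10 (the usual convention for this formula; the text only writes "log"), and the example salt concentration is an illustrative value rather than one taken from the text.

```python
import math


def melting_temperature(gc_percent: float, length: int,
                        monovalent_molar: float,
                        formamide_percent: float = 0.0) -> float:
    """Tm in degrees C for a probe of `length` nucleotides (14-70 nt).

    With formamide_percent = 0 this reduces to the first formula; a
    non-zero value applies the 0.61 * (% formamide) correction.
    """
    if not 14 <= length <= 70:
        raise ValueError("formula is stated for probes of 14-70 nucleotides")
    if monovalent_molar <= 0:
        raise ValueError("cation concentration must be positive")
    return (81.5
            + 16.6 * math.log10(monovalent_molar)  # base 10 is an assumption
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


def hybridization_window(tm: float, hybrid: str = "DNA-DNA") -> tuple[float, float]:
    """Suggested hybridization range: 20-25 below Tm (DNA-DNA), 10-15 below (RNA-DNA)."""
    if hybrid == "DNA-DNA":
        return (tm - 25.0, tm - 20.0)
    if hybrid == "RNA-DNA":
        return (tm - 15.0, tm - 10.0)
    raise ValueError("hybrid must be 'DNA-DNA' or 'RNA-DNA'")
```

For a hypothetical 20-nucleotide probe with 50% G+C in 1.0 M monovalent cations, `melting_temperature(50, 20, 1.0)` evaluates to 81.5 + 0 + 20.5 - 25 = 77.0, and adding 50% formamide lowers this by 30.5 degrees.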

Exemplary highly stringent conditions may refer, e.g., to washing in 6x SSC/0.05%
sodium pyrophosphate at 37°C (for about 14-base oligos), 48°C (for about 17-base
oligos), 55°C (for about 20-base oligos), and 60°C (for about 23-base oligos).

Accordingly, the invention further provides nucleotide primers or probes which
detect the r region polymorphisms of the invention. The assessment may be conducted
by means of at least one nucleic acid primer or probe, such as a primer or probe of
DNA, RNA or a nucleic acid analogue such as peptide nucleic acid (PNA) or locked
nucleic acid (LNA). The nucleotide primer or probe is preferably capable of
hybridising to a subsequence of the region corresponding to SEQ ID NO: 1, or a part
thereof, or a region complementary to SEQ ID NO: 1.

According to one aspect of the present invention there is provided an allele-specific
oligonucleotide probe capable of detecting an r region polymorphism at one or more
positions in the r region as defined by the positions in SEQ ID NO: 1.

The allele-specific oligonucleotide probe is preferably 5-50 nucleotides, more
preferably about 5-35 nucleotides, more preferably about 5-30 nucleotides, more
preferably at least 9 nucleotides.

The design of such probes will be apparent to the molecular biologist of ordinary
skill. Such probes are of any convenient length such as up to 50 bases, up to 40
bases, more conveniently up to 30 bases in length, such as for example 8-25 or
8-15 bases in length. In general such probes will comprise base sequences entirely
complementary to the corresponding wild type or variant locus in the region.
However, if required one or more mismatches may be introduced, provided that the
discriminatory power of the oligonucleotide probe is not unduly affected. The
probes of the invention may carry one or more labels to facilitate detection.

In one embodiment, the primers and/or probes are capable of hybridizing to a
subsequence selected from the group of subsequences below, wherein the
polymorphism is denoted as for example T/C:

1. GCTCTGAAAC TTACTAGCCC (A/G) GTATTTATGG AGAGGCATTT
2. GTGGTCAAAT TCTCATTCAT CGTGG (T/C) CCAGGCAAGC ACACTTCCTC
3. ACCCTGAGGT GAGCACCTGT TCCTT (C/T) TCCTTGCCCT TAGCCCAGAG GTAGA
4. GGGCAGGGGT TTGTGCCTCC AATGA (G/A) CACAAGCTCC CCCTGCCCCC CAACT
5. CCTGGCGGTG GCCGTCACCA GCTTT (T/C) GGGGGTGTTT GGGAAGCTGG
6. CTCCAGCCCC ACTGTTCCCT (A/G) GGCCCTATTG GTCCCCCTGG
7. ACAAGGAGGA GGCAGAAGTG AGGTT (G/C) AAACCCACTG CCCAATCTTA
8. CCAACACGGT GAAACCCCGT CTGTA (T/C) TAAAAATACA AAAATTAGCC
9. AATCCAGGAC CCCATAATCT TCCGT (C/T) ATCTAAAACA ATAATGGTGA
10. CCCAAGGGGG CGAGGGGAGG GTGAA (A/G) GGGTGGGACG GGGGCAGCCG
11. GAAGTGAGAA GGGGGCTGGG GGTCG (G/-) CGCTCGCTAG CGGGCGCGGG
12. CGCACGCGCA GTATCCCGAT TGGCT (C/G) TGCCCTAGGG GATTGACGGG
13. AACTCCTGGG TTCGATCAAT ACTCA (GACA/-) ATCTTGGCAG GCGCAGGAGG
14. GCTGGGATTA CAGGCTTGAG CCACC (A/G) CGCCCGGCCT GCAAAGCCAT
15. TTTTGTATCT TTAGTAGAGA CAGG (T/G) TTTCTCCATG TTGGTCAGGC
16. GCCTCAGCCT CCCGAGTAGC TGAGACT (C/A) CAGGTGCCCG CCACCACGCC

17. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
18. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
19. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
20. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
21. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
22. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
23. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
24. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
25. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
26. GCTGTTTCCC ACGCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
27. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
28. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
29. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
30. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
31. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
32. ACAGGAGAGG GAAGGTTTTTTG (A/T) [flanking sequence illegible in source]
33. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
34. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT
35. TTGAGACTCT CTGTTTGAT (A/G) CTTTCACTCA GAAGGTGCTT C
36. AGGCCAGGCT CCTGCTGGCT G (C/G) GCTGGTGCAG TCTCTGGGGA
37. CCCCTATACC CTCAAGCAT (C/T) TATCCATTGA GTTACAAACA
38. ACCATCCCCC GCCTTCCGTT (A/C) GTCCGGCCCC CGAGGCTAGC

In another embodiment, the primers and/or probes are capable of hybridizing to a 
subsequence selected from the group of subsequences below: 

1. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT
2. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
3. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
4. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
5. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
6. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
7. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC
8. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC
9. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC
10. GCTGTTTCCC ACCGCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC
11. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG
12. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC
13. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA
14. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG
15. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT
16. ACAGGAGAGG GAAGGTTTTTTG (A/T) [flanking sequence illegible in source]
17. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG
18. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT

In yet another embodiment, the primers and/or probes are capable of hybridizing to
a subsequence selected from the group of subsequences below:

1. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT
2. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG
3. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT
4. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC
5. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG
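
The notation used in these lists (flanking sequence, the two alleles in parentheses, flanking sequence) can be expanded into the two allele-specific target sequences with a small Python sketch. The helper names are hypothetical, and treating "-" as a deletion allele is an assumption based on entries such as (G/-) in the first list.

```python
import re

# left flank, allele 1, allele 2, right flank; "-" marks an absent (deleted) allele
_ENTRY = re.compile(r"^([ACGT\s]+)\(([ACGT-]+)/([ACGT-]+)\)([ACGT\s]+)$")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand_alleles(entry: str) -> tuple[str, str]:
    """Return the two full sequences encoded by one 'LEFT (X/Y) RIGHT' entry."""
    match = _ENTRY.match(entry.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised polymorphism entry: {entry!r}")
    left, first, second, right = match.groups()
    left = "".join(left.split())    # drop the spacing between 10-mers
    right = "".join(right.split())

    def build(allele: str) -> str:
        return left + allele.replace("-", "") + right

    return build(first), build(second)


def reverse_complement(sequence: str) -> str:
    """Sequence of the complementary strand, read 5' to 3'."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]
```

For example, `expand_alleles("TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC")` yields one 41-nucleotide sequence carrying T and one carrying G at the polymorphic position; `reverse_complement` gives the strand a probe would need in order to hybridise to either of them.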

It is preferred in one embodiment that at least one sequence polymorphism is
assessed in a region corresponding to SEQ ID NO: 1 position 1521-37792, including
at least one sequence polymorphism assessed in a region corresponding to
SEQ ID NO: 1 position 7760-22885.

In another embodiment, the methods of the invention relate to at least one
sequence polymorphism being assessed in a region corresponding to SEQ ID NO: 1
position 34391-37683, ending with the coding region of ASE-1 (cagcdgtgtag), where
tag is the stop codon.

In a preferred embodiment the primers or probes are selected from one or more of
the following:

TGGCTAACACGGTGAAACC (SEQ ID NO:7)
GGAATCCAAAGATTCTATGATGG (SEQ ID NO:8)
GGGAGGCGGAGCTTGCAGTGA (SEQ ID NO:9)
CTGAGATCGCACCACTGCAC (SEQ ID NO:10)
GGTTTTCTGCTCTGCACACG (SEQ ID NO:11)
CCTTTCTCCTTCCACCAACG (SEQ ID NO:12)
CGGGCTACAGGGTTACCTGAG (SEQ ID NO:13)
TCTGCAACCTGGTGCGAGCAGC (SEQ ID NO:14)
CCTACCACCATCATCACATCC (SEQ ID NO:15)
GCCTTGCCAAAAATCATAACC (SEQ ID NO:16)
CCTCTCCCCAATTAAGTGCCTTCACACAGC (SEQ ID NO:17)
AGCCAGGGAGGTTGAGGCT (SEQ ID NO:18)
AGACAGGCCTGAATCAGCAC (SEQ ID NO:19)
GCAATGAGCCGAGATAGAA (SEQ ID NO:20)
TGGCTAGCCCATTACTCTA (SEQ ID NO:21)

According to another aspect of the present invention there is provided a diagnostic
nucleic acid primer capable of detecting an r region polymorphism at one or more
positions in the r region as defined by the positions in SEQ ID NO: 1.

The primer or probe may be a diagnostic nucleic acid primer defined as an allele
specific primer, used, generally together with a constant primer, in an
amplification reaction such as a PCR reaction, which provides the discrimination
between alleles through selective amplification of one allele at a particular
sequence position. The diagnostic primer is preferably 5-50 nucleotides, more
preferably about 5-35 nucleotides, more preferably about 5-30 nucleotides, more
preferably at least 9 nucleotides.

In accordance with the present invention diagnostic primers are provided,
comprising the sequences set out below as well as derivatives thereof wherein about
6-8 of the nucleotides at the 3' terminus are identical to the sequences given
below and wherein up to 10, such as up to 8, 6, 4, 2, or 1 of the remaining
nucleotides may be varied without significantly affecting the properties of the
diagnostic primer. Conveniently, the sequence of the diagnostic primer is as
written below.
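
The rule for permitted primer derivatives (a conserved 3' terminus, limited variation elsewhere) can be checked with a small Python sketch. It assumes, for illustration only, that a derivative has the same length as the reference primer and differs by substitutions; the function name and default values are hypothetical choices within the ranges stated above.

```python
def is_permitted_derivative(reference: str, candidate: str,
                            identical_3prime: int = 8,
                            max_varied: int = 10) -> bool:
    """Check a candidate primer against the derivative rule.

    The last `identical_3prime` nucleotides (the 3' terminus) must match the
    reference exactly, and at most `max_varied` of the remaining positions
    may differ. Equal length is an assumption of this sketch.
    """
    if not 1 <= identical_3prime <= len(reference):
        raise ValueError("identical_3prime must be between 1 and the primer length")
    if len(candidate) != len(reference):
        return False
    if candidate[-identical_3prime:] != reference[-identical_3prime:]:
        return False
    # count substitutions in the part of the primer upstream of the 3' terminus
    upstream_ref = reference[:-identical_3prime]
    upstream_cand = candidate[:-identical_3prime]
    varied = sum(1 for r, c in zip(upstream_ref, upstream_cand) if r != c)
    return varied <= max_varied
```

Taking GGTTTTCTGCTCTGCACACG (SEQ ID NO:11) purely as an example reference, a candidate that changes the first two bases passes the check, whereas one that changes the 3'-terminal base is rejected.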

Furthermore, as described above at least two sets of primer(s) and/or probe(s) may
be combined in the method thereby increasing the correlation probability. This
second or other set of primer(s) and/or probe(s) may be a nucleotide or nucleotide
analogues hybridising to a region within the region r or to a sequence different
from the region r. Said sequence different from the region r is preferably a region
in chromosome 19, preferably in chromosome 19q. In particular such second or other
primer or probe may be selected from one or more of the sequences below, or the
complementary strands:

GCCCCGTCCCAGGTA (SEQ ID NO:21)
AGCCCCAAGACCCTTTCACT (SEQ ID NO:22)
GTCCCATAGATAGGAGTGAAAG (SEQ ID NO:23)
CCCTAGGACACAGGAGCACA (SEQ ID NO:24)
TTGTGCTTTCTCTGTGTCCA (SEQ ID NO:25)
TATCAGAAAAGGCTGGAGGA (SEQ ID NO:26)
GAGTGGCTGGGGAGTAGGA (SEQ ID NO:27)
GCCAAGCAGAAGAGACAAA (SEQ ID NO:28)
CCTCAGATGTCCTCTGCTCA (SEQ ID NO:29)
GCCACAGCCCCAGCAAGTAG (SEQ ID NO:30)
AGGACCACAGGACACGCAGA (SEQ ID NO:31)
CATAGAACAGTCCAGAACAC (SEQ ID NO:32)
TTAGCTTGGCACGGCTGTCCAAGGA (SEQ ID NO:33)
ACAGAATTCGCCCCGGCCTGGTACAC (SEQ ID NO:34)
TTGAAACTGGAACTCTGAGAAGG (SEQ ID NO:35)
TGGTGGATGGTGTGAAGCA (SEQ ID NO:36)
CGTTTCTCCAACTTCTTCTCCATTTCCACC (SEQ ID NO:37)
GGGGATCATGTCGTCAATGGACT (SEQ ID NO:38)
ATGCCCTGTAGGTTCAATGG (SEQ ID NO:39)
TGGAGGTCTTTAGGGGCTTG (SEQ ID NO:40)
GGCTGGTCCCCGTCTTCTCCTTCC (SEQ ID NO:41)
TCTCTGTTGCCACTTCAGCCTC (SEQ ID NO:42)
GTCCTGCCCTCAGCAAAGAGAA (SEQ ID NO:43)
TTCTCCTGCGATTAAAGGCTGT (SEQ ID NO:44)
ATCCTGTCCCTACTGGCCATTC (SEQ ID NO:45)
TGTGGACGTGACAGTGAGAAAT (SEQ ID NO:46)
TGGAGTGCTATGGCACGATCTCT (SEQ ID NO:47)
CCATGGGCATCAAATTCCTGGGA (SEQ ID NO:48)
CACCTGGCTCATTTTTGTAT (SEQ ID NO:49)
TCATCCAGGTTGTAGATGCCA (SEQ ID NO:50)
AGGCTCAACAAGGAAAAATGC (SEQ ID NO:51)
GCTAGACAGTCAAGGAGGGACG (SEQ ID NO:52)
AAAGGGTGGGTGTGGGAGACATTGG (SEQ ID NO:53)
AAACCAACCTAGGCACCCCAAA (SEQ ID NO:54)
CAGTGTCCAAAGAGCACC (SEQ ID NO:55)
CTACCCCTTTAGCGACC (SEQ ID NO:56)
TCCTGCCCCCAGAGCGTCACC (SEQ ID NO:57)
GTACGGTCCACATAATTTTGGAGGA (SEQ ID NO:58)
CGACGAACTTCTCTGAAGCGAA (SEQ ID NO:59)
AGCGACACGGGCATCTGG (SEQ ID NO:60)
ATGAGCGTCCACCTCCTGAACC (SEQ ID NO:61)
AGGCAGCAGCATCGTCATCCCC (SEQ ID NO:62)
TGCATAGCTAGGTCCTGC (SEQ ID NO:63)
AACTGACRAAACTAGCTCTATGGGGTGGTGCCGCA (SEQ ID NO:64)
CTGGCTCTGAAACTTACTAGCCC (SEQ ID NO:65)
GCTGGACTGTCACCGCATG (SEQ ID NO:66)
GGAGCAGGGTTGGCGTG (SEQ ID NO:67)
TGCCCTCCCAGAGGTAAGGCCT (SEQ ID NO:68)
CCCTCCCGGAGGTAAGGCCTC (SEQ ID NO:69)
GATCAAAGAGACAGACGAGC (SEQ ID NO:70)
GAAGCCGAGGAAATGC (SEQ ID NO:71)
GGACGCCCACCTGGCCAACC (SEQ ID NO:72)
CGTGCTGCCCAACGAAGTG (SEQ ID NO:73)



The primers and probes may be manufactured using any convenient method of
synthesis. Examples of such methods may be found in standard textbooks, for
example "Protocols for Oligonucleotides and Analogues; Synthesis and Properties,"
Methods in Molecular Biology Series; Volume 20; Ed. Sudhir Agrawal, Humana
ISBN: 0-89603-247-7; 1993; 1st Edition. If required the primer(s) and probe(s)
may be labelled to facilitate detection.



Kit



According to another aspect of the present invention, there is provided a diagnostic
kit comprising at least one diagnostic primer of the invention and/or at least one
allele-specific oligonucleotide primer of the invention.

The diagnostic kits may comprise appropriate packaging and instructions for use in
the methods of the invention. Such kits may further comprise appropriate buffer(s)
and polymerase(s) such as thermostable polymerases, for example Taq polymerase.

Preferred kits can comprise means for amplifying the relevant sequence such as
primers, polymerase, deoxynucleotides, buffer, metal ions; and/or means for
discriminating the polymorphism, such as one or a set of probes hybridising to the
polymorphic site, a sequence reaction covering the polymorphic site, an enzyme or
an antibody; and/or a secondary amplification system, such as enzyme-conjugated
antibodies, or fluorescent antibodies. The kit-of-parts preferably also comprises a
detection system, such as a fluorometer, a film, an enzyme reagent or another
highly sensitive detection device.



The methods described herein may be performed, for example, by utilizing
pre-packaged diagnostic kits. The invention therefore also encompasses kits for
detecting the presence of a polypeptide or nucleic acid of the invention in a
biological sample (i.e., a test sample). Such kits can be used, e.g., to determine
if a subject is suffering from or is at increased risk of developing a disorder
associated with a disorder-causing allele, or aberrant expression or activity of a
polypeptide of the invention. For example, the kit can comprise a labeled compound
or agent capable of detecting the polypeptide or mRNA or DNA or RAI gene sequences,
e.g., encoding the polypeptide in a biological sample. The kit can further comprise
a means for determining the amount of the polypeptide or mRNA in the sample (e.g.,
an antibody which binds the polypeptide or an oligonucleotide probe which binds to
DNA or mRNA encoding the polypeptide). Kits can also include instructions for
observing that the tested subject is suffering from or is at risk of developing a
disorder associated with aberrant expression of the polypeptide if the amount of
the polypeptide or mRNA encoding the polypeptide is above or below a normal level,
or if the DNA correlates with presence of an RAI allele that causes a disorder.






For antibody-based kits, the kit can comprise, for example: (1) a first antibody
(e.g., attached to a solid support) which binds to a polypeptide of the invention;
and, optionally, (2) a second, different antibody which binds to either the
polypeptide or to the first antibody and is conjugated to a detectable agent.

Identification of an allele as having implication for risk of cancer

An allele in the r region can be identified as correlated with an increased risk of
developing cancer on the basis of statistical analyses of the incidence of a
particular allele in two groups of individuals with and without cancer,
respectively, according to the χ² test, which is well known in the art.
Furthermore, an allele in the region can be identified as an allele correlated with
prognosis of cancer on the basis of statistical analyses of the incidence of a
particular allele in individuals demonstrating different prognostic characteristics.
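
The χ² comparison of allele incidence between the two groups can be sketched in Python as follows. This is a minimal illustration using the Pearson statistic for a 2x2 table without continuity correction; the function name and the example counts are hypothetical and are not data from this document.

```python
import math


def chi_square_2x2(cases_with: int, cases_without: int,
                   controls_with: int, controls_without: int) -> tuple[float, float]:
    """Pearson chi-square statistic and p-value (1 degree of freedom).

    Rows are the two groups (with and without cancer); columns are carriers
    and non-carriers of the allele under test.
    """
    a, b, c, d = cases_with, cases_without, controls_with, controls_without
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / margins
    # for one degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With hypothetical counts of 30 carriers among 100 cases and 15 carriers among 100 controls, `chi_square_2x2(30, 70, 15, 85)` gives a statistic of about 6.45 and a p-value of roughly 0.011, which would conventionally be read as an association between the allele and disease status in that invented sample.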



Identification of humans having increased likelihood of responding to treatment



It is further contemplated that the present invention provides a method for
identifying a human subject as having an increased likelihood of responding
positively to a cancer treatment, comprising determining the presence in the
subject of an r region allele genotype correlated with an increased likelihood of
positive response to treatment, whereby the presence of the genotype identifies the
subject as having an increased likelihood of responding to cancer treatment.

The treatment mentioned herein may be any cancer treatment, such as conventional
cancer treatment, for example X-ray, chemotherapeutics, surgical excision or
combinations thereof.

Protein Products of the Gene(s)

Gene products of the region r, or peptide fragments thereof, can be prepared for a
variety of uses. For example, such gene products, or peptide fragments thereof, can
be used for the generation of antibodies, in diagnostic assays.






The gene products of the invention include, but are not limited to, human RAI gene
products, and ASE-1 gene products. In the following the invention is described in
relation to RAI gene product.

Gene product, sometimes referred to herein as a "protein" or "polypeptide",
includes those gene products encoded by the RAI gene sequences shown as position
7821-21350 in SEQ ID NO: 1. Among gene product variants are gene products
comprising amino acid residues encoded by the polymorphisms. Such gene product
variants also include a variant of the RAI gene product.

In addition, RAI gene products may include proteins that represent functionally
equivalent gene products. In preferred embodiments, such functionally equivalent
RAI gene products are naturally occurring gene products. Functionally equivalent
RAI gene products also include gene products that retain at least one of the
biological activities of the RAI gene products described above, and/or which are
recognized by and bind to antibodies (polyclonal or monoclonal) directed against
RAI gene products.

Antibodies to Gene Products

Described herein are methods for the production of antibodies capable of
specifically recognizing one or more gene product epitopes or epitopes of conserved
variants or peptide fragments of the gene products. Furthermore, antibodies that
specifically recognize mutant forms are encompassed by the invention. The terms
"specifically bind" and "specifically recognize" refer to antibodies that bind to
RAI gene product epitopes at a higher affinity than they bind to non-RAI (e.g.,
random) epitopes.

Such antibodies may include, but are not limited to, polyclonal antibodies,
monoclonal antibodies (mAbs), humanized or chimeric antibodies, single chain
antibodies, Fab fragments, F(ab')2 fragments, fragments produced by a Fab
expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding
fragments of any of the above, including the polyclonal and monoclonal antibodies
described below. Such antibodies may be used, for example, in the detection of a
gene product in a biological sample and may, therefore, be utilized as part of a
diagnostic or prognostic technique whereby patients may be tested for abnormal
levels of gene products, and/or for the presence of abnormal forms of such gene
products. Such antibodies may also be utilized in conjunction with, for example,
compound screening schemes, as described below, for the evaluation of the effect of
test compounds on gene product levels and/or activity.

For the production of antibodies against a gene product, various host animals may
be immunized by injection with a RAI gene product, or a portion thereof. Such host
animals may include, but are not limited to, rabbits, mice, and rats, to name but a
few. Various adjuvants may be used to increase the immunological response,
depending on the host species, including but not limited to Freund's (complete and
incomplete), mineral gels such as aluminum hydroxide, surface active substances
such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole
limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as
BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

Polyclonal antibodies are heterogeneous populations of antibody molecules derived
from the sera of animals immunized with an antigen, such as a gene product, or an
antigenic functional derivative thereof. For the production of polyclonal
antibodies, host animals such as those described above may be immunized by
injection with gene product supplemented with adjuvants as also described above.

Monoclonal antibodies, which are homogeneous populations of antibodies to a
particular antigen, may be obtained by any technique that provides for the
production of antibody molecules by continuous cell lines in culture. These
include, but are not limited to, the hybridoma technique of Kohler and Milstein
(1975, Nature 256:495-497; and U.S. Pat. No. 4,376,110), the human B-cell hybridoma
technique (Kosbor et al., 1983, Immunology Today 4:72; Cole et al., 1983, Proc.
Natl. Acad. Sci. USA 80:2026-2030), and the EBV-hybridoma technique (Cole et al.,
1985, Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Such antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA,
IgD and any subclass thereof. The hybridoma producing the mAb of this invention may
be cultivated in vitro or in vivo. Production of high titers of mAbs in vivo makes
this the presently preferred method of production.

In addition, techniques developed for the production of "chimeric antibodies"
(Morrison et al., 1984, Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger et al.,
1984, Nature 312:604-608; Takeda et al., 1985, Nature 314:452-454) by splicing the
genes from a mouse antibody molecule of appropriate antigen specificity together
with genes from a human antibody molecule of appropriate biological activity can be
used. A chimeric antibody is a molecule in which different portions are derived
from different animal species, such as those having a variable region derived from
a murine mAb and a human immunoglobulin constant region. (See, e.g., Cabilly et
al., U.S. Pat. No. 4,816,567; and Boss et al., U.S. Pat. No. 4,816,397, which are
incorporated herein by reference in their entirety.)

In addition, techniques have been developed for the production of humanized
antibodies. (See, e.g., Queen, U.S. Pat. No. 5,585,089, which is incorporated
herein by reference in its entirety.) An immunoglobulin light or heavy chain
variable region consists of a "framework" region interrupted by three hypervariable
regions, referred to as complementarity determining regions (CDRs). The extent of
the framework region and CDRs have been precisely defined (see, "Sequences of
Proteins of Immunological Interest", Kabat, E. et al., U.S. Department of Health
and Human Services (1983)). Briefly, humanized antibodies are antibody molecules
from non-human species having one or more CDRs from the non-human species and a
framework region from a human immunoglobulin molecule.

Alternatively, techniques described for the production of single chain antibodies
(U.S. Pat. No. 4,946,778; Bird, 1988, Science 242:423-426; Huston et al., 1988,
Proc. Natl. Acad. Sci. USA 85:5879-5883; and Ward et al., 1989, Nature 334:544-546)
can be adapted to produce single chain antibodies against gene products. Single
chain antibodies are formed by linking the heavy and light chain fragments of the
Fv region via an amino acid bridge, resulting in a single chain polypeptide.

Antibody fragments that recognize specific epitopes may be generated by known
techniques. For example, such fragments include but are not limited to: the F(ab')2
fragments, which can be produced by pepsin digestion of the antibody molecule, and
the Fab fragments, which can be generated by reducing the disulfide bridges of the
F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed (Huse
et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of
monoclonal Fab fragments with the desired specificity.

Immunoassays for gene products, conserved variants, or peptide fragments thereof
will typically comprise incubating a sample, such as a biological fluid, a tissue
extract, freshly harvested cells, or lysates of cells in the presence of a
detectably labeled antibody capable of identifying gene product, conserved variants
or peptide fragments thereof, and detecting the bound antibody by any of a number
of techniques well known in the art.

The biological sample may be brought in contact with and immobilized onto a solid
phase support or carrier, such as nitrocellulose, that is capable of immobilizing
cells, cell particles or soluble proteins. The support may then be washed with
suitable buffers followed by treatment with the detectably labeled gene product
specific antibody. The solid phase support may then be washed with the buffer a
second time to remove unbound antibody. The amount of bound label on the solid
support may then be detected by conventional means.

By "solid phase support or carrier" is intended any support capable of binding an
antigen or an antibody. Well-known supports or carriers include glass, polystyrene,
polypropylene, polyethylene, dextran, nylon, amylases, natural and modified
celluloses, polyacrylamides, gabbros, and magnetite. The nature of the carrier can
be either soluble to some extent or insoluble for the purposes of the present
invention. The support material may have virtually any possible structural
configuration so long as the coupled molecule is capable of binding to an antigen
or antibody. Thus, the support configuration may be spherical, as in a bead, or
cylindrical, as in the inside surface of a test tube, or the external surface of a
rod. Alternatively, the surface may be flat such as a sheet, test strip, etc.
Preferred supports include polystyrene beads. Those skilled in the art will know
many other suitable carriers for binding antibody or antigen, or will be able to
ascertain the same by use of routine experimentation.

One of the ways in which the RAI gene product-specific antibody can be detectably
labeled is by linking the same to an enzyme, such as malate dehydrogenase,
staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase,
alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish
peroxidase, alkaline phosphatase, asparaginase, glucose oxidase,
beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate
dehydrogenase, glucoamylase and acetylcholinesterase. The detection can be
accomplished by colorimetric methods that employ a chromogenic substrate for the
enzyme. Detection may also be accomplished by visual comparison of the extent of
enzymatic reaction of a substrate in comparison with similarly prepared standards.

Detection may also be accomplished using any of a variety of other immunoassays,
for example by radioactively labeling the antibodies or antibody fragments, or by
labeling the antibody with a fluorescent compound. Among the most commonly used
fluorescent labeling compounds are fluorescein isothiocyanate, rhodamine,
phycoerythrin, phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

The antibody can also be detectably labeled using fluorescence emitting metals
such as 152Eu, or others of the lanthanide series, or by coupling it to a
chemiluminescent compound.

Diseases

Described herein are various applications of gene sequences, gene products,
including peptide fragments and fusion proteins thereof, and of antibodies directed
against gene products and peptide fragments thereof. Such applications include, for
example, prognostic and diagnostic evaluation of cancer and the identification of
subjects with a predisposition to such disorders, as described above.

The method according to the invention may be used in relation to any cancer form,
such as, but not limited to, skin carcinoma including malignant melanoma, breast
cancer, lung cancer, colon cancer and other cancers in the gastro-intestinal tract,
prostate cancer, lymphoma, leukemia, pancreas cancer, head and neck cancer,
ovary cancer and other gynecological cancers. In particular the method is relevant
for skin cancer, lung cancer, colon cancer and breast cancer, such as skin cancer
and breast cancer, preferably wherein the skin cancer is basal cell carcinoma.

In particular, the method is relevant for early age cancer, such as early age breast
cancer.



Gene nucleic acid sequences, described above, can be utilized for transferring
recombinant nucleic acid sequences to cells and expressing said sequences in
recipient cells. Such techniques can be used, for example, in marking cells or for the
treatment of cancer. Such treatment can be in the form of gene replacement therapy.
Specifically, one or more copies of a normal RAI gene or a portion of the RAI
gene that directs the production of an RAI gene product exhibiting normal RAI gene
function, may be inserted into the appropriate cells within a patient, using vectors
that include, but are not limited to, adenovirus, adeno-associated virus, and
retrovirus vectors, in addition to other particles that introduce DNA into cells, such as
liposomes.

Examples



The examples relate to prediction from sequence polymorphisms in the region r to
cancer. Blood was collected before (example 6) or after (examples 1 through 5) the
persons acquired cancer. However, the sampling time is considered immaterial, as
DNA in a polyclonal blood sample is not expected to change over time.

The particular sequence polymorphisms analysed in these examples are listed in
Table 6, together with their sources of information and their definition as sequences.

Table 6. The markers used, their sources of information, and their currently
estimated positions on chromosome 19, as well as their position in figure 2.



Name        Source of        Position in   GenBank Accession    Chromosome   Position
            identification   sequence      Number of sequence   Position     in Figure 2
                                                                (Mbases)

XRCC1e10    Ref.1            26152         L34079               59.420       1
CKMe8       rs#[illegible]   20076         AC005781             61.361       2
XPDe23      Ref.1            35931         L47234               61.479       3
XPDe10      Ref.1            23591         L47234               61.491       4
XPDe6       Ref.1            22541         L47234               61.4923      5
XPDi4       rs#1618536       19244         L47234               61.4924      6
RAIe6       rs#6966          8788          L47234               61.506       7
RAIi1       rs#1970764       875           L47234               61.514       8
ASE1e1      rs#967591        232125        NT_011242            61.534       9
ERCC1e4     Ref.1            19007         M63798               61.547       10
FOSBe4      rs#1049698       34821         M89651               61.601       11
SLC1A5e8    rs#1060043       60620         AC008622             62.946       12
GLTSCR1e1   rs#1035938       20775         AC010519             63.986       13
LIG1e6      rs#20560         111           L27710                            14

rs numbers were derived from the NCBI's database dbSNP.

Ref.1: Shen, M.R., Jones, I.M. and Mohrenweiser, H. (1998) Nonconservative
amino acid substitution variants exist at polymorphic frequency in DNA repair genes
in healthy humans. Cancer Res., 58: 604-8.

MATERIALS AND METHODS

Study groups. The groups of Caucasian Americans with and without BCC have
been described previously (Athas et al, Cancer Res. 51: 5786-5793, 1991; Wei et al,
Proc. Natl. Acad. Sci. USA, 90: 1614-8, 1994). Briefly, the study was a clinic based
case control study at the Johns Hopkins Hospital, which serves multiple participating
dermatologists in Maryland. Cases were histo-pathologically confirmed primary
BCCs and were diagnosed between 1987-1990. The controls were patients from the
same physician practices and had a diagnosis of mild skin disorders. All participants
were Caucasians living near Baltimore and were between 20 and 60 years of age.
The controls were frequency matched to the cases by age and sex. Cases and
controls with any other forms of cancer were excluded. In the questionnaire, the study
subjects were asked if they had any blood relatives with skin cancer, and were
asked to specify the type of cancer. Study subjects with relatives with basal cell
carcinoma and squamous cell carcinoma and "skin cancer" were included in the group
of subjects with a family history of skin cancer. Subjects with relatives with melanoma
were not included. At the clinic visit the subjects gave informed consent, were examined
by dermatologists, completed a structured questionnaire and provided blood. DNAs
from available frozen lymphocytes were purified using Puregene (Gentra Systems)
and were genotyped. Initially, 71 cases and 118 controls were included in this study.
However, the number of persons varied between analyses, as the supply of DNAs
gradually was depleted. In case of the SNP RAIi1 only 133 persons could be
genotyped reliably.



The groups of 20 psoriatic Danes with and 20 psoriatic Danes without BCC have
been described previously (Dybdahl et al, Cancer Epidemiol. Biomarkers Prev.,
8: 77-81, 1999). Briefly, BCC subjects were identified from a population-based cohort
of persons treated by Danish dermatologists in the year 1995, and fulfilled the
following criteria: (a) age in 1995 < 50 years; and (b) clinically verified diagnosis of
psoriasis. The diagnosis of BCC was clinically and histologically confirmed. The
controls, consisting of psoriasis cases without BCC, were selected from among patients
treated in the years 1992-1995 for psoriasis by dermatologists who participated in the
national cohort study 1995. The controls were matched by age and sex. The patients
with psoriasis and BCC differed from the national cohort of BCC in that the average
age of first BCC was 38 years against 55 years in the cohort. A number of cases had had
multiple BCCs. There was a tendency that cases had been treated for a longer time
than the controls, and also that the treatments were more intense. This was to be
expected as treatment of psoriasis involves a number of carcinogenic treatment
modalities. DNAs from available frozen lymphocytes were purified using Puregene
(Gentra Systems) and were genotyped.

Primers and probes. Table 7 includes the polymorphisms typed on LightCycler™, the
primers used for the PCR reaction and the probes used for detection and typing of
the PCR products. Table 8 lists the polymorphisms typed by conventional PCR-RFLP,
and the primers and restriction enzymes used. Table 9 lists the polymorphisms
typed by SNaPshot technology and the primers used. Table 10 lists the polymorphisms
analyzed on a Taqman, and the primers and probes used. Hobolth DNA,
Hillerød, Denmark or DNA Technology, Aarhus, Denmark, synthesized the primers
in tables 7, 8, and 9. TIB Mol-Biol, Berlin, Germany synthesized the LightCycler
probes. TAGCopenhagen ApS (Tagc.com, Copenhagen, Denmark) synthesized the
primers, and Applied Biosystems synthesized the fluorescent Taqman probes in table
10.



Table 7. Design of primers and fluorogenic probes for LightCycler

ASE1 e1
Forward primer: 5'-GGTTTTCTGCTCTGCACACG
Reverse primer: 5'-CCTTTCTCCTTCCACCAACG
Anchor probe:   5'-TCTGCAACCTGGTGCGAGCAGC-Fluorescein
Sensor probe:   5'-LCRed640-GGGCTACAGGGTTACCTGAG-p

CKM e8
Forward primer: 5'-TTGAAACTGGAACTCTGAGAAGG
Reverse primer: 5'-TGGTGGATGGTGTGAAGCA
Anchor probe:   5'-LCRed640-CCTTTCTCCAACTTCTTCTCCATTTCCACC-p
Sensor probe:   5'-GGGGATCATGTCGTCAATGGACT-Fluorescein

ERCC1 e4
Forward primer: 5'-AGGACCACAGGACACGCAGA-3'
Reverse primer: 5'-CATAGAACAGTCCAGAACAC-3'
Anchor probe:   5'-LCRed640-TGGCGACGTAATTCCCGACTATGTGCTG-p-3'
Sensor probe:   5'-CGCAACGTGCCCTGGGAAT-Fluorescein

FOSB e4
Forward primer: 5'-AGGCTCAACAAGGAAXKAATGC
Reverse primer: 5'-GCTAGACAGTCAAGGAGGGACG
Anchor probe:   5'-LCRed640-AAAGGGTGGGTGTGGGAGACATTGG-p
Sensor probe:   5'-AAACCAACCTAGGCACCCCAAA-Fluorescein

GLTSCR1 e1
Forward primer: 5'-CGACGAACTTCTCTGAAGCGAA
Reverse primer: 5'-AGCGACACGGGCATCTGG
Anchor probe:   5'-nATGAGCGTCCACCTCCTGAACO-Fluorescein
Sensor probe:   5'-LCRed640-AOGCAGCAGCATCGTCATCCCC-p

LIG1 e6
Forward primer: 5'-ATGCCCTGTAGGTTCAATGG
Reverse primer: 5'-TGGAGGTCTTTAGGGGCTTG
Anchor probe:   5'-GGCTGGTCCCCGTCTTCTCCTTCC-Fluorescein
Sensor probe:   5'-LCRed640-TCTCTGTTGCCACTTCAGCCTC-p

RAI i1
Forward primer: 5'-TGGCTAACACGGTGAAACC
Reverse primer: 5'-GGAATCCAAAGATTCTATGATGG
Anchor probe:   5'-GGGAGGCGGAGCTTGCAGTGA-Fluorescein
Sensor probe:   5'-LCRed640-CTGAGATCGCACCACTGCAC-p

SLC1A5 e8
Forward primer: 5'-CAGTGTCGAAAGAGCACC
Reverse primer: 5'-CTACCCCnTAGCGACC
Anchor probe:   5'-LCRed640-TCCTGCCCCCAGAGCGTCACC-p
Sensor probe:   5'-GTACGGTCCACATAATTTTGGAGGA-Fluorescein

XPD e10
Forward primer: 5'-GATCAAAGAGACAGACGAGC
Reverse primer: 5'-GAAGCCCAGGAAATGC
Anchor probe:   5'-GGACGCCCACCTGGCCAACC-Fluorescein
Sensor probe:   5'-LCRed640-CGTGCTGCCCAACGAAGTG-p



Table 8. Primers and restriction enzymes used for typing of SNPs using PCR- 
RFLP 



Genawon Prfmere gfajfim Digested 

Fragmenls 

XRCCrexonIO TTGTGCTTrCTCTOTGTCCA Mspl 240.375bp(A) 

TATCAGAAAA6GCTGQA66A 6161^(0) 

£RCCtoxon4 AGGACCACAGGACACQCAGA BwOI 157. 368bp (A): 

CATAGAACAQTCCAGAAC5AC S25bp{G) 
Xn>»Xon6 1^1 CACMCCT66CTCAmTT(3TAT 7YH 
TCATCCAG6TT6TA6AT6CCA 
2^t TGGAGT6CTAT6GCACGATCTCT m 56. 114. 462 bp (A); 
CCATGGGCATCAAATTCCT6GGA S6.59ebp«g 
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XPD 0Xon23 l^t GTCCTQCCCTCA6CAAAGABAA 
TTCTCCTGCGATTAAAGGCTOT 

ATCCTGTCCCTACT6GCCATTC P«fl 88.100.158(0): 
TGTGAACOTGACAOTGAOAAAT 100.224(A) 
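
PCR-RFLP typing reads a genotype from the pattern of restriction fragments. The
sketch below illustrates that logic for the ERCC1 exon 4 entry, assuming the fragment
sizes as they can be read from Table 8 (157 and 368 bp for the A allele, and 525 bp,
the sum of the two, for the G allele). The function name, the size tolerance and the
decision rules are illustrative assumptions and not part of the described protocol.

```python
def call_rflp_genotype(observed_bp, cut_bp=(157, 368), uncut_bp=525, tolerance_bp=10):
    """Call a biallelic RFLP genotype from observed fragment sizes (in bp).

    Assumed reading of Table 8 for ERCC1 exon 4: the A allele yields the
    157 and 368 bp fragments, the G allele the single 525 bp fragment.
    """
    def seen(size):
        # A band counts as present if any observed fragment is within tolerance.
        return any(abs(size - obs) <= tolerance_bp for obs in observed_bp)

    has_cut = all(seen(size) for size in cut_bp)
    has_uncut = seen(uncut_bp)
    if has_cut and has_uncut:
        return "AG"   # both patterns present: heterozygote
    if has_cut:
        return "AA"
    if has_uncut:
        return "GG"
    return None       # pattern not interpretable; the digest would be repeated


print(call_rflp_genotype([157, 368]))
print(call_rflp_genotype([157, 368, 525]))
print(call_rflp_genotype([525]))
```

Returning None for an unexpected pattern keeps failed digests from being scored as
a genotype by accident.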



Table 9. Design of primers and SNaPshot primers for SNaPshot typing on
sequenator.

XRCC1 exon7
Forward primer:  5'-GTCCCATAGATAGGAGTGAAAG
Reverse primer:  5'-0CCTAGGACACAGGAGCACA
SNaPshot primer: 5'-TGCATAGCTAGGTCCTGC

XRCC1 exon17
Forward primer:  5'-GCCAAGCAGAAGAGACAAA
Reverse primer:  5'-GAGTGGCTGGGGAGTAGGA
SNaPshot primer: 5'-AAiTTGACRAAACTAGCTCTATGGGGTOGTGCCGCA

RAI exon6
Forward primer:  5'-CCTACCACCATCATCACATCC
Reverse primer:  5'-GCCTTGCCAAAAATCATAACC
SNaPshot primer: 5'-CCTCTCCCCAATTAAGTGCCTTCACAGAGC

XPD intron4
Forward primer:  5'-CGCAAAAACTTGTGTATTCACC
Reverse primer:  5'-CCCATTTTTATCATCAGCAACC
SNaPshot primer: 5'-CTG GCTCTGAAACnTACTAGCCC

Table 10. Design of primers and probes for Taqman.

XRCC1 exon10
Forward primer: 5'-GCT GGA CTG TCA CCG CAT G
Reverse primer: 5'-^GA GCA GGG TTG GCG TG
Probe (A):      5'-Fam-TGC CCT CCC AGA GGT AAG GCC T-Tamra
Probe (G):      5'-Vic-CCC TCC CGG AGG TAA GGC CTC-Tamra

Determination of polymorphisms by LightCycler. Genotypes of the American persons
for polymorphisms in ASE-1e1, CKMe8, ERCC1e4, FOSBe4, GLTSCR1e1, LIG1e6,
RAIi1, SLC1A5e8 and XPDe10 and of the Danish persons for polymorphisms in
ASE-1e1, CKMe8, FOSBe4, LIG1e6 and SLC1A5e8 were detected using LightCycler™
(Roche Molecular Biochemicals, Mannheim, Germany). PCR was performed by
rapid-cycling in a reaction volume of 20 µl with 0.5 µM of each primer, 0.045 µM of
anchor and sensor probe, 3.5 mM MgCl2, approximately 7 - 25 ng genomic DNA,
and 2 µl LightCycler DNA Master Hybridization probe buffer (Roche Molecular
Biochemicals, Cat No 2158 825). This buffer contains Taq DNA polymerase, dNTP
mix, and 10 mM MgCl2. In some cases the reaction mixture also contained 5%
DMSO. The temperature cycling consisted of denaturation at 95°C for 2 sec,
followed by 46 cycles consisting of 2 sec at 95°C, 10 sec at 67°C, and 30 sec at 72°C.
The last period at 72°C was extended to 120 sec. The melting profile was
determined by a temperature ramp from 50°C to 95°C with a rate of 0.1 degree/sec.
For RAIi1 the melting profile was run 3 times, and the last curve was used.

PCR-RFLP analyses. Genotypes of the American persons for polymorphisms in
XPDe6 and XPDe23 and of Danish psoriatics for polymorphisms in XRCC1e10,
ERCC1e4, XPDe6, and XPDe23 were detected using PCR-RFLP technique (Shen
et al, see above; Dybdahl et al, see above; Vogel et al, Cancer Epidemiol.
Biomarkers Prev., 8: 77-81 (2001)). The reactions were performed as reported in
those publications.

Determination of polymorphisms by SNaPshot technique on sequenator. The
polymorphisms in RAIe6, XPDi4, XRCC1e7, and XRCC1e17 in the American persons
were typed simultaneously on an ABI Prism 310 sequenator (Applied Biosystems,
Foster City, CA, USA) using SNaPshot technique (Lindblad-Toh et al, Nature
Genetics, 24: 381-8, 2000). The PCR reaction consisted of 1 µl of purified genomic
DNA, 1 pmole of each primer (DNA Technology, Aarhus, Denmark), 12.5 nmole of
each dNTP (Bioline, London, UK), 100 nmole MgCl2 (Bioline), 0.15 µl BIOTAQ™
DNA Polymerase (Bioline) in a total volume of 20 µl of water. The program
consisted of 4 min at 96°C, followed by 26 cycles of 96°C for 30 sec, 60°C for 30 sec,
and 72°C for 60 sec. The last cycle was followed by 72°C for 6 min. The primers
and dNTPs were removed in reactions containing 2 U Shrimp Alkaline Phosphatase
(SAP) (Roche), 2 U Exonuclease I (Biolabs, Denmark), and 0 ^ PCR reaction in a
total volume of 14 µl water. The reactions were incubated at 37°C for 60 min and
72°C for 15 min. The SNaPshot reactions contained 1 µl of SNaPshot Ready Reaction
Mix (Applied Biosystems), 0.5 µl of each SNaPshot primer (XRCCe7-ss1:
4 pmol/µl; XPDi5-ss1: 0.5 pmol/µl; RAIe7-ss1: 1 pmol/µl; XRCCe17-ss1: 2 pmol/µl),
2 µl of the purified PCR product, and 1.5 µl of buffer (200 mM Tris-HCl, 5 mM MgCl2,
pH 9.0). The reactions were cycled 25 times: 98°C for 10 s, 50°C for 5 s, and 60°C
for 30 s. The primers and dNTPs were removed in a reaction containing 1 U SAP,
0.8 µl 10xSAP buffer, and 5 µl SNaPshot reaction in a total volume of 8 µl of water.
Two µl purified product was added to 10 µl of concentrated deionized formamide
(Amresco, Ohio, USA), incubated for 5 min at 95°C, and analyzed on the sequenator.
The two markers in XRCC1, in exon 7 and exon 17, could not be reliably scored
and thus were excluded from further consideration.
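
The PCR mix for the SNaPshot typing is specified in amounts per reaction rather than
in concentrations. As a rough aid, the conversion to final concentration in the 20 µl
reaction can be sketched as follows. The helper name is hypothetical, and the inputs
are simply the amounts quoted in the text, so any transcription error in those
amounts carries through.

```python
def final_concentration_uM(amount_pmol, volume_ul):
    """Convert an amount in pmol and a volume in microlitres to micromolar.

    1 pmol per microlitre equals 1 micromolar, so this is a plain division.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 20.0

# Amounts per reaction as quoted in the text, expressed in pmol.
amounts_pmol = {
    "each primer": 1.0,     # 1 pmole
    "each dNTP": 12.5e3,    # 12.5 nmole
    "MgCl2": 100e3,         # 100 nmole
}

for component, pmol in amounts_pmol.items():
    conc = final_concentration_uM(pmol, REACTION_VOLUME_UL)
    print(f"{component}: {conc:g} uM")
```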

Determination of polymorphisms by real-time PCR using Taqman probes. The
polymorphism in XRCC1e10 in the American persons was analysed using the ABI Prism
7700 sequence detection system (Applied Biosystems, Foster City, CA, USA). PCR
primers and Taqman probes were designed using Primer Express v 1.0 (Applied
Biosystems). The reactions were performed in MicroAmp optical tubes sealed with
MicroAmp optical caps (Applied Biosystems) containing a 10 µl reaction volume: 1x
Taqman buffer A, 2.5 mM MgCl2, 200 µM each of dATP, dCTP, dGTP, 400 µM dUTP,
SOOnM each primer, 200 nM each probe, 0.01 U/µl AmpErase UNG, 0.025 U/µl
AmpliTaq Gold Polymerase. Thermal cycler conditions were: tubes were incubated
at 50°C for 2 min followed by 10 min at 95°C. The incubation was succeeded by 45
cycles of 95°C for 15 sec and 64°C for 1 min.

Example 1 

DNA from humans from the American cohort of patients with basal cell carcinoma
and controls, described in Materials and Methods, was typed with respect to a
number of sequence polymorphisms located in and around the claimed region r. The
resulting statistical p-values for association of occurrence of the individual sequence
polymorphisms with the status of patients are depicted in Figure 2. Also depicted are
the calculated odds ratios for association of sequence polymorphism and disease.
For the calculation of the odds ratios the heterozygote genotypes were combined
with the lesser group of homozygotes, and the ordering of the groups was chosen
such that the odds ratio became more than or equal to 1. The results show that the
sequence polymorphism RAIi1 is strongly associated with disease in this cohort (p =
0.004). Bonferroni correction for the number of tests made indicates that a result
less than 0.007 must be considered significant at a level of 0.05. Thus, even after
correction for multiplicity of testing this result is significant.

The numbers next to the points in the curves are merely a help to identify the single
sequence polymorphisms:
1, XRCC1e10; 2, CKMe8; 3, XPDe23; 4, XPDe10; 5, XPDe6; 6, XPDi4; 7, RAIe6; 8,
RAIi1; 9, ASE-1e3; 10, ERCC1e4; 11, FOSBe4; 12, SLC1A5e8; 13, GLTSCR1e1;
14, LIG1e6.
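
The multiplicity argument in Example 1 can be illustrated with a plain Bonferroni
correction, in which the overall significance level is divided by the number of tests.
The text does not state how many tests were counted; the value of seven used below
is an assumption chosen only because 0.05 / 7 is about 0.007, the threshold quoted
above. The function names are likewise illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a plain Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumption: seven tests, chosen only because 0.05 / 7 is about 0.007.
N_TESTS = 7
print(round(bonferroni_threshold(0.05, N_TESTS), 4))
print(is_significant(0.004, 0.05, N_TESTS))
```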






Example 2 



Those persons in example 1 who got basal cell carcinoma before the age of 50
years were selected, and the results from analysis of RAIi1 were compared with the
status of the patients. There was a strong relationship between the occurrence of
the individual genotypes of the sequence polymorphism and the status of the
patients (Table 7; Odds ratio = 12.3; p(χ²) = 0.00014).

Table 7. Occurrences of genotype for the sequence polymorphism RAIi1 in Americans
with basal cell carcinoma occurring before 50 years of age and in controls.

Genotype   Number of cases before   Number of controls
           50 years of age

AA                  31                      44
AG                   2                      32
GG                   0                       5
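
The odds ratio and chi-square test behind figures of this kind can be sketched as
follows, collapsing the genotypes as described in Example 1 (heterozygotes combined
with the smaller homozygote group, here AA versus AG + GG). The text does not say
which variant of the test was used, so both the uncorrected and the
continuity-corrected (Yates) statistics are computed; the function names are
illustrative. With the counts of Table 7 as read here, the odds ratio comes out at
about 13 and the corrected p-value at roughly 0.0001, near but not identical to the
quoted values.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square (1 degree of freedom) for a 2x2 table, with its p-value."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by n/2, not below zero.
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Table 7 collapsed to AA versus AG + GG.
cases_aa, cases_other = 31, 2 + 0
controls_aa, controls_other = 44, 32 + 5

print(odds_ratio(cases_aa, cases_other, controls_aa, controls_other))
print(chi_square_2x2(cases_aa, cases_other, controls_aa, controls_other))
print(chi_square_2x2(cases_aa, cases_other, controls_aa, controls_other, yates=True))
```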



Example 3 



The data of Example 2 were combined with results of genotyping the neighbouring
sequence polymorphism RAIe6. There was a very strong association between the
combined genotypes of RAIi1 and RAIe6 and the status of the patients. Thus,
almost all American cases occurring before the age of 50 yrs were homozygous for
the A alleles of both RAIi1 and RAIe6, while only approximately half of the controls
were so (Table 8; Odds ratio = 12.8; p(χ²) = 0.00006).

Table 8. Combined occurrences of different genotypes for the sequence
polymorphisms RAIi1 and RAIe6 in American cases occurring before 50 years of age
and in controls.

                          RAIi1
             RAIe6        AA     AG     GG

BCC cases    AA           30      0      0
             AT            0      2      0
             TT            0      0      0

Controls     AA           42     10      1
             AT            2     21      0
             TT            1      0      2



Example 4 






The data of Example 2 were combined with results of genotyping the sequence
polymorphism GLTSCR1e1 located outside the claimed region r. There was a very
strong association between the combined genotypes of RAIi1 and GLTSCR1e1 and
the status of the patients. It was obvious to define "risk-genotypes" as having two As
in RAIi1 and at least one G in GLTSCR1e1 (listed as C in Table 9, which appears to
report the complementary strand). This corresponds to the assumptions that the A
allele of RAIi1 is recessive and the G allele of GLTSCR1e1 is dominant. If one does
so, one finds that 25 out of 25 cases have a "risk-genotype", while only 28 out of 62
controls have one (Table 9; Odds ratio > 30; p(χ²) = 0.000002).

Table 9. Combined occurrences of genotypes for the sequence polymorphisms
RAIi1 and GLTSCR1e1 in American cases of basal cell carcinoma occurring before
50 years of age and in controls.

                          RAIi1
             GLTSCR1e1    AA     AG     GG

BCC cases    CC           17      0      0
             CT            8      0      0
             TT            0      0      0

Controls     CC           15     18      3
             CT           13      7      0
             TT            3      3      0
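
The counting of "risk-genotypes" in Example 4 can be reproduced from the cell counts
of Table 9. The sketch below assumes that the G allele named in the text is the C of
the table (complementary strand) and that the unlabelled third row for the cases is
TT. Because no case lacks a risk genotype, a plain odds ratio is undefined; the code
adds 0.5 to every cell (the Haldane-Anscombe correction), a common workaround that
the text itself does not mention. All names are illustrative.

```python
# Genotype counts read from Table 9, keyed by (GLTSCR1e1 genotype, RAIi1 genotype).
CASES = {
    ("CC", "AA"): 17, ("CC", "AG"): 0, ("CC", "GG"): 0,
    ("CT", "AA"): 8,  ("CT", "AG"): 0, ("CT", "GG"): 0,
    ("TT", "AA"): 0,  ("TT", "AG"): 0, ("TT", "GG"): 0,
}
CONTROLS = {
    ("CC", "AA"): 15, ("CC", "AG"): 18, ("CC", "GG"): 3,
    ("CT", "AA"): 13, ("CT", "AG"): 7,  ("CT", "GG"): 0,
    ("TT", "AA"): 3,  ("TT", "AG"): 3,  ("TT", "GG"): 0,
}


def is_risk_genotype(gltscr1, rai):
    """Risk genotype: homozygous A at RAIi1 and at least one C at GLTSCR1e1.

    Assumption: the 'G' allele named in the text is the 'C' of Table 9.
    """
    return rai == "AA" and "C" in gltscr1


def count_risk(table):
    """Return (number with a risk genotype, total number) for one group."""
    risk = sum(n for (gltscr1, rai), n in table.items()
               if is_risk_genotype(gltscr1, rai))
    return risk, sum(table.values())


def odds_ratio_with_correction(risk_cases, n_cases, risk_controls, n_controls):
    """Odds ratio with 0.5 added to every cell to cope with zero cells."""
    a, b = risk_cases + 0.5, (n_cases - risk_cases) + 0.5
    c, d = risk_controls + 0.5, (n_controls - risk_controls) + 0.5
    return (a * d) / (b * c)


risk_cases, n_cases = count_risk(CASES)
risk_controls, n_controls = count_risk(CONTROLS)
print(risk_cases, n_cases, risk_controls, n_controls)
print(odds_ratio_with_correction(risk_cases, n_cases, risk_controls, n_controls))
```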






Example 5 

DNA from humans from the cohort of Danish psoriatics with basal cell carcinoma
and controls, described in Materials and Methods, was typed with respect to a
number of sequence polymorphisms located in and around the claimed region r. The
resulting statistical p-values for association of occurrence of the individual sequence
polymorphisms with the status of patients are depicted in Figure 3. The results show
that the sequence polymorphism ERCC1e4 is strongly associated with disease in
this cohort (p ≤ 0.01).

Example 6 

Blood samples were collected from a large number of Danish citizens and frozen.
After a number of years those women who got breast cancer in the intervening
period were identified, as well as a set of matching controls. DNAs were purified from
the blood samples of these persons and a number of polymorphisms, namely RAIi1,
ASE-1e3 and ERCC1e4, in the region of interest were typed. The polymorphisms
were subsequently combined such that the high-risk group was homozygous for the
high-risk alleles of all three polymorphisms (RAIi1, ASE-1e3 and ERCC1e4). All
other genotypes were combined into the low-risk group (Table 10; OR = 1.59; p(χ²)
= 0.004).

Table 10. Occurrence of a combined "high-risk" genotype for RAIi1, ASE-1e3 and
ERCC1e4, as opposed to all other combinations of genotypes for the sequence
polymorphisms RAIi1, ASE-1e3 and ERCC1e4, in Danish cases of breast
cancer and controls.

             High-risk    Low-risk

Cases           120          277
Controls         85          312
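
As an illustration of how an odds ratio such as the one quoted for Table 10 is
obtained, the sketch below computes it together with a 95% confidence interval by
Woolf's log-based method. The interval is not reported in the text and is shown only
to indicate the precision of the estimate. The cell counts assumed are 120 high-risk
and 277 low-risk cases against 85 high-risk and 312 low-risk controls, which
reproduce the quoted odds ratio of 1.59; the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio for [[a, b], [c, d]] with a Woolf (log-based) confidence interval."""
    or_value = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper


# Assumed cell counts: cases 120 high-risk / 277 low-risk,
# controls 85 high-risk / 312 low-risk.
or_value, lower, upper = odds_ratio_with_ci(120, 277, 85, 312)
print(f"OR = {or_value:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```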






The DNAs in these examples were purified from available frozen lymphocytes using
Puregene (Gentra Systems). A variety of other ways of purifying DNA is available to
the expert and would also be expected to lead to the wanted results.

Analysis of sequence polymorphisms can be performed with a variety of techniques,
some of which have been used in the examples of this application. Most often a
number of techniques can produce the wanted result.

Similarly, the choice of primers and probes in a particular assay is to some extent
free and other primers and probes might well produce similar results.

Finally, it is to be expected that assays for other sequence polymorphisms in the
region of interest may produce roughly similar results. Our particular choice of
sequence polymorphisms and assays used in the examples are thus not intended to
limit our claims. Thus, at present about 30 SNPs within the region r are listed in
NCBI's database dbSNP including rs#2070830, rs#2017104, rs#2017154 and
rs#2377328, all within or very close to RAI. Other forms of polymorphisms such as
the tandem repeat polymorphisms D19S543 and D19S393 are also known to occur
in the region and can probably serve as markers in the present invention. Moreover,
it is very likely that the region contains a number of as yet undiscovered
polymorphisms. For instance, the sequence of the 5' half of RAI and its upstream
promoter region is currently only a draft version and new polymorphisms of potential
use for this invention are likely to be uncovered as more sequence reads of this
segment are produced.

Sequence of the r region of chromosome 19

The following depicts the region r stretching from the beginning of, but not including,
the XPD gene, to approximately the end of ERCC1, and includes the genes RAI,
LOC162978, and ASE-1. More specifically r is bounded by and includes the
following two sequences: AGAACCCCCO CCCCTCCACC TCGTCTCAAA and
TCCCTCCCCA GAGACTGCAC CAGCGCAGCC, and is defined by SEQ ID NO: 1
herein below:



SSJi^^TCCATTTCAAATATTAATAATAATAACTAATAAATAAAfl^T^ 
TTGTpGmCTTCAACAAATAQCTATGTGGCATCT^ATOT^^ 
?o ^.^S5i?^^^'^®^<5GTGACCATGAcJCGCCTC^^^ 
^° 5^ini7^*^^^<2'^<5ACAGGGTCTTTCTCTGTCGCCAAGGCT6<SSTG^ 
CAGTCACAGCTCACTCCAGCCTCCACCTCTTGGGCTCAAGCGATC^ 
?^?P-iiS^®°^°®<3ACCACAGGTGTGC>Src^A^^ 



C^GbT^TGCCTG^^^^A^SS^ 

cacgaggtcaggagttcaagaccagcctgctdS^gqtoaSc^ 



50 <3TCCCAGCCT6GAGTATGGTGGTGT6AATTTG6CTCATOcScc^ 

UAtCACGCCTGGTTAAl 1 1 1 J 1 1 IAATrrnTGTAGAGACGAeGGTATC?rCACTATGTT- 
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^jAAGTCGTGGGATTACAGACGTGAGCCACTGTGCCCGGCTTMmATTT^ 
ATTTTTTTATGTTTACrrTTCTATCTCCTACAGQAAGAAAATATAm 
GGTCTCGCTATQTTGCCCAGGCTGGTATTGGGCT<5S^CAT^TCT^ 
TCCC7y\AGTACTGG6ATTACAAQCGTGAGCCTCTQCATCCA6C^ 
TTACTGTCACCTACAGAGTCCTCTGTAACTAGCTTACTGCTCATC^ 
CCACCTTACTGCTCTGATCTCCTCCTCTCTCTWCCCAGCTCAm 

^IS^IS]B9II®?T"QTctctaaaacataacaagcacatSc/?ct^^ 

CACCAGCTATTTTGTCTGCCTGGAATGCTGTTTCCCCTGATAGCOXTOT^ 
^^^ISii$SICCCTCAGCTCmGCTCAATTGTCAACRCTrcGG 
S^^H9I'^^*^^'r^«^'^<^T^6GGAGeCTGAGCTGGGCA^TSvSrKS 
GAGTTCGAGACCAGCCTGGCCAAGATGGTGAAATCCCGTCTCTAOTvK^ 
oIJ??Ji?.'^^^iI°^I*®^'^^'^T''^CCAGTAATCCTi^^ 

CAGGAGAATTGCTGGAACCCGGGAGGCAGAG6CrGCAGTGA(»cS3^CAT6C^ 
CACTGTACTCCAGCCTGGGTGACAAAGCAAG^CTCTCTC^ 
^I^?.TS^J;?i^^^^^Tr5CTGACCAC(W^ 

cacgcacgcacqcacgcacacacacaovcgcvvcqcacqSvcXcaca^^ 

X*^^*!S?ISi^^*^^^'^<5^CT*rrGGQAGGCTQGAAC»GGTGGATCATO^ 
CAGGAGTTCCAGACCAQCCTGACCAACA«MeT«AAAnrjT«A^??x^ . 



10 



IS 



caaaaattagctgggtgtggtgtcgcatgcctgtS^^ 

OK S??f^ZE5?j^S^J?5'®®GCAACAAG . 

gaggtagagqttgcagtaagccaagatcgcgccattccict^ 

30 eACTCCGTCTCAQAAAOQAAQAAAQAAGQAAAGAQAGAAAQAQA^^ 

^^^^^-?^^A'^i'^'^°'«5^eAGA^^ .. 

?Sg^^ft?f^^?^:?T^'^^°^^^ ^' "" 1 1 1 1 1 1 1 mm 1 1 i ^ agacagggtS 



TCACTTCTGTTGCTCCAGCn^GTGCAGTGGTGAGAACAT^^ 



40 



45 



^.T-.T^r-.-rrz-v/irry!^;^"' "' ■ ■ v» i vato va i aabaatttccagaactctaat- 

GAAGAAACTQACTOGTTTATATTTTATTTTATmATTTrATTATTTTTGAGA 
^^S^^SSSS^'^^^^'^^TCCTGCCT^GCCTCCCCAGGAGCT 



55 



SiSf^P^^^'^^°°<3AGAAAAAGTTGGAGAATTTAAGAGAA^^ 
?ffr17^*l^?IS?5^^r^^F'^T^«^ATATTCCCAGc7S^^ 
^Si^^'I^L^'^^'^^^^^^Q'SAATrCAAGGCCAGCCTCGGCAACACGGTG^ 
CTGTCTCTACGGAAAATTAAAAAAAAAAAMGAQAGAGATTAGTGGGAT^ 
TftTAGTCCCAGCTACTTGGQAGGC^^ 

OA^ZIPP^^^^^^^^'^T'^AGTGAGACTCATCTCAA^AA^^ 

CACTAGAGGAAAAAAAAACTAAAGTGGGGTTTGCGGGrA^TCGG^ 

$I^S3II^^^T'^TGATCTCCAGGGA6GCTCCACGG^^TC^ 
TCAGrirCTAGAGCXJAAATTCTTTGCATAftrTTW^rT^;^^ 



CTTlACCriTCAAAGCTGGC^GCTAGCCTCTGGCTCAAGT^CA^^ 
GTCTTCCTATOCAATCTTCCTCTTATAAGAACATireG?^ 
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* ^^P/^^QCAGTCCGGCCTGGGCGAAACAGCeAGACTCCGT^^ 



30 



40 



45 
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SSP^^^TTmGTATTTTTAQTAGACACCAGGTTT^^ 



??^*°*TS®^®'^^<^^®TAATCCCAGCrACTC^G/^T^^ 
E5:'^^°®'«3GCAGAGGTTGC^GTGAGCTGAGMTG^^^ 



cS2S^*?S2Ss^"^°®°^^^'^®®TAGAGACCAATGT^^ 



CGGTGGCAAOGAOSACCACCAGCAGQGTGAG^^ 

cqacgcagctgaagctgtgtciiccaaaaaSaSS^^ 
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TATTGTCACCCAGQCTGGAGTGCAGTGGCACCATCAG6GC^^^ 
gi^T^^r rr ^ff ^^^^^^ ^ 1 1 m ] 1 1 1 1 r oV^^rFfAG-g^TSZ^^ 



^?CCAQCCTTWCTACAATTCTGTAACCCACCCACC^ 
CCCACATCTTATCACrGCACTCCAGTCTGGGTGAwSS^SS!^/^ 

^I'^^^taaatacaaattggccgggtgcg.gtggctStocctwa^^ 

GGAGACCAAGGCAGGTGGATCATrrGArsrTTP^o-fiJai^^ 




CGACTCGCCCTCGCGGAAISGACACCTftfiTr^^^^ 



AS 



f^COTQQGGTCCAQGACAQAOXn-GGAATIT^ 
^I^^TSI?J®*^"^®CTOT^6ATGGGAACCG^^^ 

^gSSS^^^C^^CGCCACCACGCCTdGBTS??^^ 
S^?S3nS^S®TGTrA6CCA6GATGGTCTCGATCTCCTC^CcA^^ 

CAG<XrrTCCAAGTAGCTG6GATTACAGGTG7^^^^ 

?^S3i^GCACCTTCCrrGAATGATAGGGTCCmA^ 
®^®I^^®Jg?CGCQATrrCGGCTCACTGCAACCTCCGCCTCCTCGGTrC^^ 

CAA06TCTCACTCTGCCACCCA6G?reS6gftSS^ 
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CCTACTTCQCCCTGCCCGTCGGGGMCCAGGTQATOTAGcSraOQ^ 
TAGQQTACAGCCTTGTGTCm-CCTACAAGCCCC^Cre^SS^ 
CTGCCAGTGGTGTGGCAATGCX;TCrrCCCACAA(yrGGcXSE3oSAeCT 

20 ccctatgccaggtaqatqgcagggttgaXXcgttc^^ 

?^^^J^i^°^C«^'^T'CTTCACAGCCACTCTSrScSS!&^^ 
CATAGCACAGCXrrCCATQTCCCCTTTTCCCTTAGGAGQGC^OvSs 

cgcaagcggtccatccctcatcctcctcctcogoSuSct^^^^g^ 
cagcccccatacccttctctccctagtaqggggtagS^ 

^ ^^S^lCCCGCCAGGTACCCAGGCGCC^cSGroiTicCTTO^ 

?eCGCA6TCAGCATAACCCTCGCQGTAAQGQTCGCA^:TTCT^^^ 



10 



15 



35 



40 



S9$£?®^®^^*^<^TCCC6CTCAGCAGCGCTCACCTCCTOAC^^ 

6|?actgggattccctcgqqatggggt<Sggqggt^^^ 

5?'S^I?iII^55f^®^<^"rTTT"ATTATrCCrrGTATT^ 
TT f 1 1 1 1 1 1 1 1 i I IGAGATGGAGTCTCGCTCTGTCACCCAGGCTGGAGTRrA^ 
^TeQTGTQATCTWVGCTCACTGCACcS^ 

TATTCTGGGCTGGGCGAGGTGGCTCATCTCTGTAAT^ 
C*GAOATCCO.AWOCCAOreMTTCMAA™SS« 
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^^GGACTTTOGMTCCAA^ 

TCI I IN M IGTTGGTTTnTTGAGACAGAGTCTCGCTCTGTCGCCCAGGCTGGA^G. 

CAGTGQTGCGATCTCA6CTCACTGCAAGCTCCG<XTC3CCGGGTTCAQQCCATrcT^^ 

TGCCTCAGCCTGCCAAGTAGCTGGQACTACG6GCG<XC<S;CA 

JIG]^™AGTAAAGATGGGGmCACCGTGTrAG^^ 

S^SHHK^^^^CCCACCTCGGCCTCCCAAAGTGCTGGGATTACAGCTOT 

i^^^J^S^'*'^^'^®°^®°°°°TGGTQGTGCACGCCrrGTAATCCCAGCTACTC^ 
^?^^P?J^®*^^<^®^'r^AC'^<5AACCCAGGAGGCACAGTTATAG^G 

TACTGMAATGTTAMTATTATCTGGGAGTGGTGGTCCAFM 

^t^^w'^ll^S^^^^iiP^^^^A^^^ATCAATTGAGCOCAGTAO^ 

™?P^T®A^<^^CCACTGCACrrcCAGCCTGGGCAACAGAGTGAGACTC^^ 

I99^T'^^'^T'^<^CTCT^ATTACAACyVTATCSAGTCCATGAAT^ 

CAAMTATGAGCATCmAATTGTCAOATTTGGTCACTI^Q^^ 

CAQTCTATGATACTAACTTTATAATrATTTTTmAAGAGAASoTT^^ 

^I9I9®GCTCACTQCAGCCTCTWCTCCTA6eTTCAAGCAATTCTCCrGC^^^ 

TCCCGAGTAGCTGGGATTACAGGWVTGCACM^CCAGGCCCAGCTi^ 

TAGCAGAGACGGGOmCACCy^TCTTGGCGAGGCTAGTCTT^CT^ 

r^?J^I??5.^?SSSr5<^CCTCCCAAGGTGCTG^^ 

SSI^P^^^^T^^^'^TAATTCTAAGATCGTGTTCAAACCTTTAAATGCT 

CTCT^AAATGTTACTATCCTAAGACGQTGACACTAGCGmGAT^^ 

S^^JSJjSn'^™^'^C°^CTTGATn'CAAAATCCTC^ 

S^^Sn3^°*?l5P®*30CTAGQAATA6GCATTrTGGGGGGGTC 

CTTCTCTGAGAAGTGATCTCITCCCGCTGTCTACGCACACGGAGTG^^ 

TCCATGTGGCTACAACCCTCTTCCCAOTCAAGATGCAGG^CC^ 

GACCATCCCCTGGTCCWVATGGTGACAACAGTAACWV^^ 

TATTTATTTATTTATTnTATnTrATT I 11 f rT G AGACGGAATCTTBCTCTGTCACO 
Si?^^^I?3^J^^^S^^*^*°^<3TAGCTGGGACTACAGGCGCCCGCCACX;^ 

XASS?SriI^SS°'^™^C^<3'^CTCAGCCrrCCTGAGTAGCTGGOATTGGAATGA- 

GACCACCACTTCTCCTGTTGTCCnTCCCAQCTTCTCCCCCACCTa^Tm 

TTATAAGACAGGAAAAAAAGGGAGAAAGCWVVACGCTra^SJ^ 

I^^I^2?J'^®a'^°accttggcgccaccatctgScct^^ 

ATAATATTAATCCCTGACCAAAACTACTGGTGTTATCTC^ 
GTCTGTQG(n^CGTCCTGCTACAGAAmCAGGCACGCGCCACCGCT<»S^ 
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10 GAGGGAGAGAAGAGAGGGTGACTGCmGATTCAGGCAAG^rraSSS^ 
ATQMCCCACTOrrQTGCCAAQACTCAQT^CATGTCS^^ 
GAAGGAATGCCCGGGGCAGGGCACA6G(^QQTTMTOGAQ/^^ 

. S*T5???o*cAOCTCTxxxm!ceecceoroe^ 



35 



TACCAACCQTQATAQATQTQATTrrCTQAGATCCTOAGAffr^^ 

ATAATACACACACACACACACGGCTGAGCATGGTGG^^ 
TpGGGAGGCTGAGGTGGGTGGATCACCTGAGGTCAGGGGTTCG^ 
CCAACATGGCAAAACCTCATCTCTACTAAA^XcACA^^ 

GGAGGCGGAGCTTGCAGTGAGCMG^CC^GCACT^y^^^^^S^^ 
TAGGGTAGATATTATTAATCTCTCTTCAWSX^ 



50 
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s S^f;^f;^^ACTGCAGCCTG6ATCT^ 
S I?^S^®Sl®??i?^ZACAGGCGCCCACTACCA^ 

^AT^JXT^PI^SISI'illl^AGTrTCACTATGTTGCCCAGGCTGGTCTTGAACTC^ 
GAQCTCAAGCAATCCTGTCTGCATTAGCCCACCAAACTGCTAGGATTACAAGGGT- 
?r'-^^FJS?SI^'^'^"^'^'r6<3TAGCTATTGATAGCTTAC^^^ 
J^T77^jni^'^^^"^®^^^'^°'^QTCTCACCCTGTCAC^^ 

ggcatgatcttggctcactgccacctccgcctccttggctcaagctgactagS^ 

Sr5S^SSfiI5S5^^*^*^^'^^^^®TATTmAGTAGAGATGQG 
$^3°?I?pTCTCGAACTrCTGACmGTeATCCGCaX3CCTC66CCT^ 
I^^f^ZT-^^^^SGty^TGAGCCACCGrGCCCGGCCCATTATTTCCCTT^ 

^AA?n^^5IH'^°°H'^*^®*5®TC(:y«:ACTTAACCT^ 

CAAATTCCCAAACTCACCTGGCCTAGCTXn-CTGCAGGQACAGTGCTTGT/Si^ 
S^J][^S^?J®I9$FCTCCCCAC0TC^ 

^^J?™*!^5?^*?'iI?*^®®®®®*5*^QQ<3QTAQ6GGm 
SHTil^9®°^®^TGGGGCTGTGQTreTGATTGT6GCTGGGGCTGTC^ 
^??^^J^5^®^™®*^®®®°®TG(n€GGGTGAAGAGGG66^ 
GGCQCQGCTGGCCCCGTGCTOJCAGAAGGCQTTCTGCAt^^AGATC^ 
?^S?J?^^"^*^Q<3AC6CT6GC6C6GG6CCC(X5CGGGGCTG 
CATGGGGATGCGGCTGACGGGCTGCCAGCTGCGAGGCAAAGTCCX^^ 

Sa^^^S^S^^^^^'^^CGCGGCAGGGTGGCGCTAGTGAGGTTGTCCTGGG- 
?ffS?S?*^^?5i^P^QQ®'^'rCGGGTGAGAGAG6GAAGGTGGAG6^^ 
^^*l^5!^5J^i^?^®®®^^<5®'^<3GTGAGGGAAGCCCTGGG^^ 
^^^^i5^?.^I^?S^^°°^°^®AAACCCAGCACAGTGAAGGGAGAGC^ 

S^?G^c^G?^SS.^I^^^^^ 

^ifSI?^S?S?ISS^'^^<^«^T«TGGCAGG&GGCGbGGT^^^ 
^^Sii^S^^^^^'^^^^^^TCCGCCAGGGGGCTGCCGCGGGGGSkGC^^ 
S5®^"^'^®*^°"rCCAAAGGCGTGGGGGGACCCTGCTGGCGGJScGG^ 
50 CGGGCCQCQGGGAGGGCGCACGGCCGAQGGAGCTQC^^^ 

J^°^LQ?GGGGTGC6CGGCGACGACGGCCGTC^ 
^^ISJ^i^Sjn^S^^^^CACTCTCTGATCGTCCGAACGGGGTQTCTTO 
GGCCGCCrrCCGGGGGGACCCTCGGCTGCCGAAGGGCTCAGGGATOQA^^ 
^?TACCGGGGCGGCTGTGGGGAGGCCAGGGCATT^^ 
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TAGTG6TGGGGAGACA0C»CTQAAT(W3CACTGr66CTAGCOCArwS^ 



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CAWnTTATQCCACTTAGGTAAAO^CCTCnTTCCTTCTGAGGGTCCCmAGATC^ 
SACTTCCACTGQTCCCCTCTTTTCTATn-CTTTC^ 

TCTrrCTTTCTTTCCTCTCTCTCCTTCCTTTCmCTCTCTCTCTCCTTCC^^ 

X?II^5I3JJ?J'^£I!T^^'*T^^®<^TCATTGCAGC^ 

SrS^I*^°ISP^^°'^°CCrCCCAAGTAGCTGGGATTACAGGTA^^^ 

CATOTGGCTAACTTrrQTATTTTTAGTAGAGACAGGGTTTCACCA^^A^^ 

^?ISF^^"^'^C"^^<=CTCAAGT6ATCCGCCTGTCTCTGA^rOTr^^ 

S^^?PP^TS^^'^°^°"^<3CCCAGa^GAT^mAAAAAA^SS^ 

?5i^^S7^;^®^^^^'^<3CAATTCTCTCACCTCGCCTTCCAAAOT^ 

^^^^^^^^^^^^^ 

CAGGGAGTCAGACGCCGTCAGGAGCGGGACW^CGOCreMCT^ 

CAAGTCC ATGCGCC ACG GAGAAGTCCAAACCC^ GTCTAAAACCTCCBtSui^^ 
I9y^^^^^^^^ g ' ' " IC I 111 11 1 1 1 M 1 1 1 1 1 I G TGTVOGTG-reTGi^^ 
GAGTXn-CGCTCTGTCGCCCAGQCGGGAGTGCAATC^GCGAT^^ 

o^Ig^E^^™^-™-^^^^^^ 

S2^Si?f^!555?°*^*^°^<^T®CCTG6CTAAATrm 

SI?I^^^3??S5SS**^^°°®'^°<5CCAAGGAGGGCGGATC^^ . 
2S!SfI?I25SIS"*IF'='=*''CTACT<M<SGASeCrGAGGCAQMGec^^ 
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CACT0C CTGGGCCTCAGTCTCCCCAACATCAAAOAARAAffiftrAAATr^Ar»r i 

TAGGTATAGCCTCGCACCACCACACCCAGCTA AI i 1 1 1 1 1 1 1 1 1 I 1 1 1 | N 1 1 1 N i ? | i. 

J??^Sr^^SJ2^^®°^^°®C°TCCCGQGTrCACGCCATTCTCCCGGGTSScCT^ 
5f^?J^?SI^?S5:S^'°^^®°C°CCCGCCACTACGCCCG6CTA^^ 
X™^^*^°'^"^*='^°*^A^^AQCCQQGATGGTCTCGATCT^^ 
10 S^?^?iiI?F.^3SI2?CAAAGTGCT6GGATTACAGGCOT^^ 









£S?I^SSS9^®®'^^^CCGCCAGGAGGGC66ACTGCGCCTGAG^^ 

i^SICrQTGGACGTCTATGGAATGGGCTAGG^^^ 
QAACAATGCAGTGCGC6AAATAAAQTTCACACATAS^GGA^^ 
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GAAC66GGAGATAGACA6AG6GAGGTGCCTGGTCCTGCCCTCCCCCCGCCTCAAG. 

*^^$5^*^'^*^^'^®^^<S<3®CQTGQAGGTGACATTTCTCACCTCAGGGTCAGGGCAAG. 
GAGCCCTGAGGCAGAAGQTTAGTCAGAAAATCTGGCGGGGGCGGATGGAMXC^ 
5 0GTCCCCCAGAGAGCTGCAGAAGAA6GAGGAGGCAGAATCCTGACCCTACAAACTC 
I^^SSJaTSl^'^T«^^<5CCTCAGTrTA0CCCT^ 
^I^^<=GGCTAT6CAAACCTCCCAQAATCCAATAGCCGCmCt^GAAT^^ 

GAGTOCACCTCC6CCAGGG6GT6CCCTAGGGCGGATGTCCCTTCTCTGGTTAG(^ 
^® CAGGTCTGACQCCCAGGTTAATGACATGTTGGGTTCGCTCAGCGGCACAGAGGAG. 
SCI??f^®^TSI°CCTCQGTGTmCT(rrCCTACCCCGCCCCCATC^^ 

OQAAAAGTCGGGGGAGAGCCGG6ACACAGCCTCCGGAGGGACCCCGGGTACCT- 
6TCCTGCTCCA(rrrCAGGAACCCAGGCTTCACTATCCCTQCOcS^^ 

^ S^^Ii?^t^^T^?^S3*^^°^^A^TCTGAQ0CTCTCATTTCTQACC/W^^ 

tcgaacccctatactacccaaaqactcggcttcctagagcctoccagttcmggSS. 

TCAGGAATTC<1\GCTCCAACGTCTCCCC6GGATQAAGG^^ 
CAAGAATTCAGGCATCCGAACCCeCTrrCCTTCC(?RSSSr>wSAC^ 

20 CGTCCQAQTGCCCTCCCAATCCTCCCAGQGACXSCGQQTGTTGGGCTT^^ 
]5J5SJSS?5^iJ®®*®<3GTGAA/^CTCACG6ATCCG6GCAGATOC^CA^^ 
S^^^T^^'^^U^°°°°"rCCeQCTTGQGGA6CGGA^A^GGG 
TGGGAACAGGTTAGAaSACGTGACTTGQGCTGGAGGGAGGCGGGTCC^ 

o« I999*^®^*^°'^T'^*^AAGCGCGG6TTe6ATCTTCAAA6GATGTCCCAGCAAGAGTT- ■ 
30 ^^15^^TCTTAG7TTG^ 

TCTTTGCCTCGGGTCi^GTGGQAATrQTAGTCCTGGAGCCCGCAGGGCTCCAC^ 

3® GA^CTGGCGCTOACACACTQGCCAQQAATQCAQTC^ 

S?ISI*^®^*5^^^"=*=^C^<^C<»ACGCGGGGCCGCCCCAGT6GGMGG^ 

r?S$S?i?^SHSS£?iiIF^°®^^TTCCTGAAAGTCATCGAGGTTrCCC^^ 

$?f^S??^5^^<5"^<^"rGAACTTGTGAGGCATCTG^^ 
I^^SSf^^^^'^^^^^QCCCTACTAACTAGTATTCTrACCTGTCT 

^® Sff^TCGA^GTCCTGTCTCTTTAAGGCTTAGGAAG^^ 

I^iiSS5l^S£53I°CAACAAACCATATTGGACAGACGATGGGGGCGACCCA^^^ 

GACCOGACGGGCCTCTGACTCCAGCAATACAGCGAATCAGCGGCTTT^ 

CATTm^GAAAAAGACTTCTTCCTCGGTmCTGCTCTGCACACG 

«i CSi?3Z33T^HS?^®A'^°<3GGAGTCGAGCAATGC^^ 

^ ?^^?6®C®CTWC6GATGATGCC^^^ 
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TTCTGCTGCAGATGCTGCTCGGTTCTCTTGTCCCCCCMCmACCGCGMGCCCC- 




°ISSSA®ATCQTC>^GGC^TTGGCAGGCAAGCGGCAC^^ 
Si^S^lS^^S^^'^^^CTGGAGAAGCGACCCTGCTGGOCCCCTCAACGGAGGCAG^^^ 

CAGCAATCCCTGTCAGGGAQtXXTCTGCAGCCCATXXJCAGCAAQTCCCCCACCAO^ 
S^I^9P^^°^^«AOGCCTCQQriCTGTGCCTTTGGTCGSUc^ 

S^iS^I?£^^52I°'^**'^®°**°°'^^CTCAGGAGGCAGTGAATGGGCACG^ 

CCTO<3AQGT6GACAT6GCTTTGG66TCGCCA6AAATGGATGTGCGGAAGAaS5J! 
eAAGAAAAAAAATCAGCAGCTG^ 

QAGCCCACAGTQGAQACACTGGAGCCTCTGGGAGTGCTGTTCCCGTCCACCAC^ 
^ ' ^^?Sii^^?^9«^<5eeAAAQAAACCTTCGAGCCA^^ 
. GAAGCAGGAACAGATTAACACTXSAGCCTCTAGAAGACACAGTTO^ 

CAAAAAGAGAAA6AGGC AAAAARrtftACrift A A AO/s a t»/!> a r^rse* A/«Ai«<«»^M 



w^rwawfiwuM^ •'wuAi^eAeccTCTAGAAGACACAGTCCTGTCCCCGAC- 
CAWW^^^G/^GAGGCAAAAG6GGACeGAAGGGATGGAGCCAGAGGAGGGGG 
™^SII?^i5I9^^°^^^^®T®AAGGTGGAGCCACTGGAGGAAGCCATCCCTCT. 
-^^ ©^CCCCTACGAAGA^A^ 
3^ 60ACOGA6GCQATGQAGCCAGTQGAGCCGGA 

TCGCAGCTCCCAAAAAGAAQACQAAGAAAGAAAAACAGCAAGATGCaScAGTC 
$^^S^?^^^?^°^*^°T<5G6GCCTGA6CTGCCGGATGACCTTCAGTOtSg^^ 
I9SS5L<1^I£S'^*^^'^Q'^G'^S'^«3AAGAAGAAAGAGAGAGGTCACAC^^ 
^ Si^9556IH'^CCACTAGAGCCTGAACTGCCAGGGGAGGGACA6CCTGAA6CCAG. 
GGCAACTCCGGGATCCACCAAGAAQAQQAAGAAGCAGAGTCAGGAAAGCCGa^ 

Example 7 

The cases and controls in Example 6 had been individually matched with respect to
age, menopausal status and hormone treatment. It was therefore possible to make
a paired analysis. This generally reduces the possibility of bias and confounding, but
often produces less significant results. When the "high-risk" group was analysed, i.e.
carriers of the high-risk combination of RAI, ASE-1 e3 and ERCC1 genotypes versus
all other genotypes, we found a rate ratio (RR) = 1.64, confidence interval
(CI) = 1.17-2.29, with a level of significance p = 0.004. Thus, the "high-risk"
genotype was clearly overrepresented among the breast cancers.
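
As a rough consistency check on the figures above, the reported interval can be converted back to a standard error and a two-sided p-value. The sketch below assumes a Wald-type 95% interval that is symmetric on the log scale; the text does not state which method was actually used, and the function name is illustrative only.

```python
import math

def wald_summary(rr, ci_low, ci_high, z_crit=1.96):
    """Back-calculate SE, z and two-sided p from a ratio and its CI.

    Assumes the interval is exp(log(rr) +/- z_crit * se). This is an
    assumption about the analysis, not something stated in the text.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided normal tail probability
    midpoint = math.sqrt(ci_low * ci_high)  # geometric midpoint of the interval
    return se, z, p, midpoint

# Figures reported in Example 7
se, z, p, midpoint = wald_summary(1.64, 1.17, 2.29)
print(f"SE(log RR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}, midpoint = {midpoint:.2f}")
```

Under these assumptions the geometric midpoint of the interval agrees with the reported RR, and the back-calculated p-value is close to the reported 0.004.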

Example 8

In the data of Example 7, the "high-risk" group was further analysed, i.e.
carriers of the high-risk combination of RAI, ASE-1 e3 and ERCC1 genotypes versus
all other genotypes, among those pairs that were less than 55 years of age. This
increased the difference dramatically, indicating that the high-risk genotype
predisposes to early breast cancer (rate ratio (RR) = 9.5, confidence interval
(CI) = 2.21-40.79, level of significance (p) = 0.003). In older age brackets, the
RR was still above 1, but not significantly so. Thus, the combination of the three
SNPs allows for the definition of a high-risk group for early breast cancer.
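
For an interval that is symmetric around the rate ratio on the log scale, the upper limit follows from the rate ratio and the lower limit. The sketch below applies that relation to the figures of this example; the symmetry is an assumption about how the interval was computed, and the helper name is hypothetical.

```python
def log_symmetric_upper(rr, ci_low):
    """Upper CI limit implied by a ratio and its lower limit.

    Assumes symmetry on the log scale, i.e. upper = rr**2 / ci_low.
    This is an assumption, not a method stated in the text.
    """
    return rr * rr / ci_low

# RR and lower limit reported for the under-55 pairs
upper = log_symmetric_upper(9.5, 2.21)
print(f"implied upper limit: {upper:.1f}")
```

The implied value is close to 41; a small difference from a reported limit is expected because the RR itself is rounded to one decimal.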
Claims:

1. A method for estimating the cancer risk of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1, or a part thereof, or
  - in a region complementary to SEQ ID NO: 1, or a part thereof, or
  - in a transcription product from a sequence in a region corresponding to
    SEQ ID NO: 1, or a part thereof, or
  - in a translation product from a sequence in a region corresponding to
    SEQ ID NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer risk of said individual based on the sequence
  polymorphism response.

2. The method according to claim 1, wherein the cell sample is a blood sample, a
tissue sample, a sample of secretion, semen, ovum, a washing of a body surface,
such as a buccal swab, a clipping of a body surface, including hairs and nails.

3. The method according to any of the preceding claims, wherein the cell is
selected from white blood cells and tumor tissue.

4. The method according to any of the preceding claims, wherein the sequence 
polymorphism comprises at least one mutation base change. 


5. The method according to any of the preceding claims, wherein the sequence 
polymorphism comprises at least two base changes. 

6. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least one single nucleotide polymorphism.

7. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least two single nucleotide polymorphisms.

8. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least one tandem repeat polymorphism.

9. The method according to any of the preceding claims, wherein the sequence
polymorphism comprises at least two tandem repeat polymorphisms.


10. The method according to any of the preceding claims, wherein the cancer is
selected from skin carcinoma including malignant melanoma, breast cancer, lung
cancer, colon cancer and other cancers in the gastro-intestinal tract, prostate
cancer, lymphoma, leukemia, pancreas cancer, head and neck cancer, ovary
cancer and other gynecological cancers.

11. The method according to any of the preceding claims, wherein the cancer is
selected from skin cancer, lung cancer, colon cancer and breast cancer.

12. The method according to any of the preceding claims, wherein the cancer is
selected from skin cancer and breast cancer.






13. The method according to any of the preceding claims 10-12, wherein the skin
cancer is basal cell carcinoma.

14. The method according to any of the preceding claims, wherein the assessment
is conducted by means of at least one nucleic acid primer or probe, such as a
primer or probe of DNA, RNA or a nucleic acid analogue such as peptide nucleic
acid (PNA) or locked nucleic acid (LNA).

15. The method according to claim 14, wherein the nucleotide primer or probe is
capable of hybridising to a subsequence of the region corresponding to SEQ ID
NO: 1, or a part thereof, or a region complementary to SEQ ID NO: 1.

16. The method according to claim 14, wherein the primer or probe has a length of
at least 9 nucleotide or peptide monomers.

17. The method according to any of the preceding claims 14-16, wherein at least
one primer or probe is capable of hybridising to a subsequence selected from
the group of subsequences

1. GCTCTGAAAC TTACTAGCCC (A/G) GTATTTATGG AOAGGCATTT

2. GTGGTCAAAT TCTCATTCAT CGTGG (T/C) OCAGOCAAGC ACACTTCCTC

3. ACCCTGAGGT GAGCACCTGT TCCTT (C/T) TCCTTGCCGT TAGCCCAGAG GTAGA

4. GGGCAGGGGT TTGTGGCTCC AATGA (G/A) GACAAGCTCC CCCTGCCCCC GAAGT

5. CCTGGCGGTG GCCGTCACCA GCTTT (T/C) GGGGGTGTTT GGGAAGCTGG

6. CTCCAGCCCC ACTGTTCCCT (A/G) GGCCCTATTG GTCCCCCTGG

7. ACAAGGAGGA GGCAGAAGTG AGGTT (<^C) AAACCCACTG CCCAATCTTA

8. CCAACACGGT GAAACCCCGT CTGTA (T/C) TAAAAATACA AAAATTAGCC

9. AATCX:AGGAC CCGATAATCT TCCGT (C/T) ATCTAAAACA ATAATGGTGA

10. CCCAAGGGGG CGAGGGGAGG GTGAA (A/0) GGGTGGGACG GGGGCAGCCO

11. GAAGTGAGAA GGGGGCTGGG GGTCG (G/-) CGCTGGCTAG CGGGCOCGGG

12. CGCACGCGCA GTATCCCGAT TGGCT (C/G) TGCCCTAGCG GATTGACGGG

13. AACTCCTGGG TTCGATCAAT ACTCA (GACAA) ATCTTGGCAG GCGCAGGAGG

14. GCTGGGATTA CAGOCTTGAG CCACC (A/G) CGCCCGGCCT GCAAAGCCAT

15. TTTTGTATCT TTAGTAGAGA CAGG (T/G) TTTCTCCATG TTGGTCAGGC

16. GGGTCAGCCT CCCGAGTAGC TGAGACT (C/A) CAGGTGCCCG CCACCACGCC

17. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT

18. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT

19. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG

20. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT

21. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC

22. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

23. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC

24. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC

25. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC

26. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC

27. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG

28. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC

29. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA

30. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG

31. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT

32. ACAGGAGAGG GAAGGTTTTT TG (A/T) i n 1 1 1 1 U I G 1 1 I I I i i H

33. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG

34. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT

35. TTGAGACTCT CTGTTTGAT (A/G) CTTCACTCAG AAGGTGCTTC

36. AGGCCAGGCT CCTGCTGGCT G (C/G) GCTGGTGCAG TGTCTGGGGA

37. CCCCTATACC CTCAAGCAT (C/T) TATCCATTGA GTTACAAACA

38. ACCATCX:CCC GCCTTCCGTT (A/G) GTCCGGCCCC CGAGGCTAGC

or to a sequence complementary to any of the subsequences.
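
The subsequences above are written as a 5' flank, the alternative alleles in parentheses, and a 3' flank. As a minimal sketch of how such an entry could be expanded into one full target sequence per allele, together with the complementary strand mentioned in the claim, consider the following. The function names and the handling of "-" as a deletion are assumptions made for illustration, not part of the claimed method.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def expand_alleles(entry):
    """Split 'FLANK (X/Y) FLANK' into one full sequence per allele.

    Spaces inside the flanks are ignored; a '-' allele is treated as a
    deletion (an assumption about the notation).
    """
    match = re.fullmatch(
        r"\s*([ACGT\s]+)\(([ACGT-]+)/([ACGT-]+)\)([ACGT\s]+)", entry
    )
    if match is None:
        raise ValueError(f"unrecognised entry: {entry!r}")
    left, a1, a2, right = (re.sub(r"\s", "", part) for part in match.groups())
    return [left + allele.replace("-", "") + right for allele in (a1, a2)]

def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Subsequence 22 from the list above
entry = "TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG"
alleles = expand_alleles(entry)
for seq in alleles:
    print(seq, reverse_complement(seq))
```

Entries whose alleles or flanks are not legible as A, C, G or T are rejected with a `ValueError` rather than guessed.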

18. The method according to claim 17, wherein at least one nucleotide probe is
selected from the group consisting of

1. TGAAATTGTA GGTTGAGAGG CCAGGCG (C/T) GGTGCTCACG CCTGTAATTT

2. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT

3. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG

4. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT

5. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC

6. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

7. TCTTAGGACG CATGGGGGT (T/G) GAGAGAACGG GGAGATAGAC

8. CTGGGTTCTA GAACTACC (C/T) ATGCAAACCC AGCTGTTTCC

9. ATTCTGCCCT GGGTTCTAGA ACTACCT (C/A) TGCAAACCCA GCTGTTTCCC

10. GCTGTTTCCC ACCCCATAAG GCA (A/G) TAGGGGAGCC CACCTCCGCC

11. GACCTAGAAG ATCGGTCGAG A (C/T) AGCAGCTTGA GGCTGGCAGG

12. CTGGCCAGGA ATGCAGTCGG GTCAC (C/T) CTGTCTAGCC ACCGTCTCGC

13. GGGAGGAGTC GCCGATCAGG (C/T) CCCTTCCTGA AAGTCATCGA

14. GCAGCCCGGG CTACAGGGTT (A/G) CCTGAGGTGT GGGTCCCAGG

15. TAGAAATACT AACAAAGGGC (T/C) GTGGGTTTCT CCCCCTGCTT

16. ACAGGAGAGG GAAGGTTTTT TG (A/T) l ll llll i ll G U I I H IN

17. GAAGAGGAAG AAGCCCAAAG GGA (A/C) AGAAACCTTC GAGCCAGAAG

18. GCGCCTCAAC AGCCAGAAGG AGCG (A/G) AGCCTCAGGC CCAGGCAGCT

or to a sequence complementary to any of the subsequences.

19. The method according to claim 18, wherein at least one nucleotide probe is
selected from the group consisting of

1. GTTTATAAAC ATTAAACCAG (T/A) GCTGTGTGAA GGCACTTAAT

2. CCGTCTCTAT TAAAAATATA AAA (A/C) AATTTAGCCG GGTGTAGCGG

3. GGGAGGCTCG AGGCGGGC (A/G) GATTGCATGA GCTCAGGATT

4. TCCCAAGTTT CAGGGCCCAA (T/G) ATTCTCAAAT CACAGGATTC

5. TGCAGTGAGC TGAGATCGC (A/G) CCACTGCACT CCAGCCTGGG

or to a sequence complementary to any of the subsequences.

20. The method according to any of the preceding claims, wherein at least one
sequence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 1521-37752 (r).

21. The method according to any of the preceding claims, wherein at least one
sequence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 7760-22885 (RAI).

22. The method according to any of the preceding claims, wherein at least one
sequence polymorphism is assessed in a region corresponding to SEQ ID NO: 1
position 34391-37752.
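
The position ranges in claims 20-22 can be treated as simple intervals on SEQ ID NO: 1. The sketch below checks which of the claimed regions contain a given polymorphism position. It assumes 1-based, inclusive coordinates and reads the start of the first region as 1521; both are assumptions, and the names are illustrative.

```python
# Position ranges on SEQ ID NO: 1 as read from claims 20-22.
# Assumption: coordinates are 1-based and inclusive at both ends.
REGIONS = {
    "claim 20": (1521, 37752),
    "claim 21 (RAI)": (7760, 22885),
    "claim 22": (34391, 37752),
}

def regions_containing(position):
    """Return the names of all claimed regions that contain `position`."""
    return [
        name
        for name, (start, end) in REGIONS.items()
        if start <= position <= end
    ]

print(regions_containing(10000))  # a position inside the RAI range
print(regions_containing(35000))  # a position inside the claim 22 range
```

Because the regions of claims 21 and 22 both lie within the region of claim 20, any position they contain is also reported for claim 20.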

23. The method according to any of the preceding claims, wherein at least two
different probes are used, one probe being selected from the probes as defined in
any of claims 17-21, and the other probe being capable of hybridising to a
sequence different from SEQ ID NO: 1, or a part thereof, or to a sequence
complementary to a region different from SEQ ID NO: 1, or a part thereof.

24. The method according to claim 1, wherein the translational product from a
sequence in a region corresponding to SEQ ID NO: 1, or a part thereof, is an
antibody, such as a monoclonal or polyclonal antibody.

25. A method for estimating the cancer prognosis of an individual comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1, or a part thereof, or
  - in a region complementary to SEQ ID NO: 1, or a part thereof, or
  - in a transcription product from a sequence in a region corresponding to
    SEQ ID NO: 1, or a part thereof, or
  - in a translation product from a sequence in a region corresponding to
    SEQ ID NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the cancer prognosis of said individual based on the sequence
  polymorphism response.

26. The method according to claim 25, wherein the method has any of the features
as defined in any of the claims 2-24.

27. A method for estimating a treatment response of an individual suffering from
cancer to a cancer treatment, comprising

- providing a sample from said individual,

- assessing in the genetic material in said sample a sequence polymorphism

  - in a region corresponding to SEQ ID NO: 1, or a part thereof, or
  - in a region complementary to SEQ ID NO: 1, or a part thereof, or
  - in a transcription product from a sequence in a region corresponding to
    SEQ ID NO: 1, or a part thereof, or
  - in a translation product from a sequence in a region corresponding to
    SEQ ID NO: 1, or a part thereof,

- obtaining a sequence polymorphism response,

- estimating the individual's response to the cancer treatment based on the
  sequence polymorphism response.

28. The method according to claim 27, wherein the method has any of the features
as defined in any of the claims 2-24.

29. A primer or probe for use in a method as defined in any of the claims above,
said primer or probe being selected from

TGGCTAACACGGTGAAACC (SEQ ID NO:7)
GGAATCCAAAdATTCTATOATGG (SEQ ID NO:8)
GGGAGGCGGAGCTTGCAGTGA (SEQ ID NO:9)
CTGAGATCGCACCACTGCAC (SEQ ID NO:10)
GGTTTTCTGCTCTGCACACG (SEQ ID NO:11)
CCTTTCTCCTTCCACCAAGG (SEQ ID NO:12)
CGGGCTACAGGGTTACCTGAG (SEQ ID NO:13)
TCTGCAACCTGGTGCGAGCAGC (SEQ ID NO:14)
CCTACCACCATCATCACATCC (SEQ ID NO:15)
GCCTTGCCAAAAATCATAACC (SEQ ID NO:16)
CCTCTCCCCAATTAAGTGCCTTCACACAGC (SEQ ID NO:17)
AGCCAGGGAGGTTGAGGCT (SEQ ID NO:18)
AGACAGCCCTGAATCAGCAC (SEQ ID NO:19)
GCAATGAOCCOAOATAGAA (SEQ ID NO:20)
TGGCTAGCCGATTACTCTA (SEQ ID NO:21)

30. A primer or probe for use in a method as defined in any of the claims above as
the other probe

GCCCCGTCCCAGGTA (SEQ ID NO:21)
AGCCCCAAOAGCCTTTCACT (SEQ ID NO:22)
GTCCCATAGATAGGAGTGAAAG (SEQ ID NO:23)
CCCTAGGACACAGGAGCACA (SEQ ID NO:24)
TTGTGCTTTCTGTGTGTCCA (SEQ ID NO:25)
TATCAGAAAAGGCTGGAGGA (SEQ ID NO:26)
GAGTGGCTGGGGAGTAGGA (SEQ ID NO:27)
GCCAAGCAGAAGAGACAAA (SEQ ID NO:28)
CCTCAGATGTCCTCTGCTCA (SEQ ID NO:29)
GCCACAGCCCCAGCAAGTAG (SEQ ID NO:30)
AGGACCACAGGACACGGAGA (SEQ ID NO:31)
CATAGAACAGTCCAGAACAC (SEQ ID NO:32)
TTAGCTTGGCACGGCTGTCCAAGGA (SEQ ID NO:33)
ACAGAATTCGCCCCGGCCTGGTACAC (SEQ ID NO:34)
TTGAAACTGGAACTCTGAGAAGG (SEQ ID NO:35)
TGGTGGATGGTGTGAAGCA (SEQ ID NO:36)
CCTTTCTGCAACTTCTTCTCCATTTCCACC (SEQ ID NO:37)
OGGGATCATGTCGTCAATGGACT (SEQ ID NO:38)
ATGCCCTGTAGGTTCAATGG (SEQ ID NO:39)
TGGAGGTCTTTAGGGGCTTG (SEQ ID NO:40)
GGCTGGTCCCCGTCTTCTCCTTCC (SEQ ID NO:41)
TCTCTGTTGCCACTTCAGCCTC (SEQ ID NO:42)
GTCCTGCCCTCAGCAAAGAGAA (SEQ ID NO:43)
TTCTCCTGCGATTAAAGGCTGT (SEQ ID NO:44)
ATCCTGTCCX5TACTGGCCATTC (SEQ ID NO:45)
TGTGGACOTGACAGTGAGAAAT (SEQ ID NO:46)
TGGAGTGCTATGGCACGATCTCT (SEQ ID NO:47)
CCATGGGCATCAAATTCCTGGGA (SEQ ID NO:48)
CACACCTGGCTCAt i 1 1 IGTAT (SEQ ID NO:49)
TCATCCAGGTTOTAOATGCCA (SEQ ID NO:50)
AGGCTCAACAAGGAAAAATGC (SEQ ID NO:51)
GCTAGACAGTCAAGGAOGGACG (SEQ ID NO:52)
AAAGGGTGGGTGTGGOAGACATTOG (SEQ ID NO:53)
AAACCAAGCTAGGCACCCCAAA (SEQ ID NO:54)
CAGTGTCCAAAGAGCACC (SEQ ID NO:55)
CTACCCCTTTAGCGACC (SEQ ID NO:56)
TCCTOCCCGCAGAGCGTCACC (SEQ ID NO:57)
<?rACGGTCCACATAATTTTGGAGGA (SEQ ID NO:58)
CGACGAACTTCTCTGAAGCGAA (SEQ ID NO:59)
AGCGACACGGGCATCTGG (SEQ ID NO:60)
ATGAGCGTCCACCTCCTGAACC (SEQ ID NO:61)
AGGCAGCAGCATCGTCATCCCC (SEQ ID NO:62)
TGCATAGCTAGGTCCTGG (SEQ ID NO:63)
AACTGACRAAACTAGCTCTATGGG0TGGTGCGG0A (SEQ ID NO:64)
CTGGGTCTGAAACTTACTAGCCC (SEQ ID NO:65)
GCTGGACTGTCACCGCATG (SEQ ID NO:66)
GGAGCAGGGTTGGCGTG (SEQ ID NO:67)
TGCCCTCCCAGAGGTAAGGGCT (SEQ ID NO:68)
(X:CTCCCGGAGOTAAGGCCTC (SEQ ID NO:69)
GATCAAAGAGACAGACGAOC (SEQ ID NO:70)
GAAGCCCAGGAAATGC (SEQ ID NO:71)
GGACGCCCACCTGGCCAACC (SEQ ID NO:72)
CGTGCTGCCCAACGAAOTG (SEQ ID NO:73)

31. The primer or probe according to any of claims 28 or 29, wherein the probe is
operably linked to at least one label, such as operably linked to two different
labels.

32. The probe according to claim 30, wherein the label is selected from TEX, TET,
TAM, ROX, R6G, ORG, HEX, FLU, FAM, DABSYL, Cy7, Cy5, Cy3, BOFL, BOF,
BO-X, BO-TRX, BO-TMR, JOE, 6JOE, VIC, 6FAM, LCRed640, LCRed705,
TAMRA, Biotin, Digoxigenin, DuO-family, Daq-family.

33. The primer or probe according to any of claims 28-31, wherein the primer or
probe is operably linked to a surface.

34. The primer or probe according to claim 32, wherein the surface is the surface of
microbeads or a DNA chip.

35. An antibody directed to an epitope of a RAI gene product.

36. A kit for use in a method as defined in any of the claims above, comprising at
least one primer or probe, said probe being as defined in any of claims 29-35,
and optionally further amplifying means for nucleic acid amplification.